Any idea why code that looks like this
list<Foo> fooList;
processList(&fooList);
generates the following machine code?
lea rax, [rbp-48]
mov rdi, rax
call processList(std::__cxx11::list<Foo, std::allocator<Foo> >*)
lea rax, [rbp-48]
mov rdi, rax
call std::__cxx11::list<Foo, std::allocator<Foo> >::~list()
jmp .L11
mov rbx, rax
lea rax, [rbp-48]
mov rdi, rax
call std::__cxx11::list<Foo, std::allocator<Foo> >::~list()
mov rax, rbx
mov rdi, rax
call _Unwind_Resume
.L11:
add rsp, 40
pop rbx
pop rbp
ret
In particular, I don't see any paths leading to the line after the unconditional jmp .L11
(this is with GCC 6.2 with no optimization, generated on compiler explorer)
For comparison, clang 5.0.0 produces
call processList(std::__cxx11::list<Foo, std::allocator<Foo> >*)
jmp .LBB5_1
.LBB5_1:
lea rdi, [rbp - 24]
call std::__cxx11::list<Foo, std::allocator<Foo> >::~list()
add rsp, 48
pop rbp
ret
lea rdi, [rbp - 24]
mov ecx, edx
mov qword ptr [rbp - 32], rax
mov dword ptr [rbp - 36], ecx
call std::__cxx11::list<Foo, std::allocator<Foo> >::~list()
mov rdi, qword ptr [rbp - 32]
call _Unwind_Resume
Again there is an unconditional jump to a return block, and an unwind block (starting with the second lea rdi) that seems unreachable.
After a bit of research on C++ exception mechanisms, my conclusion is that the process is as follows:
At the point where the exception is thrown, __cxa_throw gets called. This is somewhat like longjmp() in that the function gets called but never returns. The function performs two main tasks:
It walks up the call stack looking for a catch. If it doesn't find any, std::terminate gets called.
If it does find a catch block, it calls all of the unwind handlers between the current function and the catch block, then transfers control to the catch block.
Back to my original machine code (with filtering turned off in compiler explorer). My comments after the hashes.
# this is the normal path
call std::list<Handle, std::allocator<Handle> >::~list()
# unconditional jump around the unwind handler
jmp .L11
.L10:
# unwind handler code, calls the local variable destructor
mov rbx, rax
.loc 2 30 0
lea rax, [rbp-32]
mov rdi, rax
call std::list<Handle, std::allocator<Handle> >::~list()
mov rax, rbx
mov rdi, rax
.LEHB1:
# carry on unwinding
call _Unwind_Resume
.L11:
Then there is the exception table
.section .gcc_except_table,"a",@progbits
.LLSDA1386:
.byte 0xff
.byte 0xff
.byte 0x1
.uleb128 .LLSDACSE1386-.LLSDACSB1386
.LLSDACSB1386:
# entry for unwind handler
.uleb128 .LEHB0-.LFB1386
.uleb128 .LEHE0-.LEHB0
.uleb128 .L10-.LFB1386
.uleb128 0
.uleb128 .LEHB1-.LFB1386
.uleb128 .LEHE1-.LEHB1
.uleb128 0
.uleb128 0
I guess that the unwind handler function can work out the positions of the unwind handler blocks from the addresses on the stack and the offsets in this table.
Firstly: This code is meant purely for fun; please do not do anything like this in production. We will not be responsible for any harm caused to you, your company or your reindeer after compiling and executing this piece of code in any environment. The code below is not safe, not portable and is plainly dangerous. Be warned. Long post below. You were warned.
Now, after the disclaimer: Let's consider the following piece of code:
#include <stdio.h>
int fun()
{
return 5;
}
typedef int(*F)(void) ;
int main(int argc, char const *argv[])
{
void *ptr = &&hi;
F f = (F)ptr;
int c = f();
printf("TT: %d\n", c);
if(c == 5) goto bye;
//else goto bye; /* <---- This is the most important line. Pay attention to it */
hi:
c = 5;
asm volatile ("movl $5, %eax");
asm volatile ("retq");
bye:
return 66;
}
To begin with, we have the function fun, which I created purely as a reference for the generated assembly code.
Then we declare a function pointer type F for functions taking no parameters and returning an int.
Then we use the not so well known GCC extension https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html to get the address of a label hi, and this works in clang too. Then we do something evil: we create a function pointer of type F called f and initialize it to the address of the label above.
Then, worst of all, we actually call this function and assign its return value to a local variable called c, and then we print it out.
The following if checks whether the value assigned to c is actually the one we need, and if so goes to bye so that the application exits normally, with exit code 66. If that can be considered a normal exit code.
The next line is commented out, but I can say this is the most important line in the entire application.
The piece of code after the label hi is to assign 5 to the value of c, then two lines of assembly to initialize the value of eax to 5 and to actually return from the "function" call. As mentioned, there is a reference function, fun which generates the same code.
And now we compile this application, and run it on our online platform: https://gcc.godbolt.org/z/K6z5Yc
It generates the following assembly (with -O1 turned on; -O0 gives a similar result, albeit a bit longer):
# else goto bye is COMMENTED OUT
fun:
mov eax, 5
ret
.LC0:
.string "TT: %d\n"
main:
push rbx
mov eax, OFFSET FLAT:.L3
call rax
mov ebx, eax
mov esi, eax
mov edi, OFFSET FLAT:.LC0
mov eax, 0
call printf
cmp ebx, 5
je .L4
.L3:
movl $5, %eax
retq
.L4:
mov eax, 66
pop rbx
ret
The important lines are mov eax, OFFSET FLAT:.L3 where the L3 corresponds to our hi label, and the line after that: call rax which actually calls it.
And it runs like this:
ASM generation compiler returned: 0
Execution build compiler returned: 0
Program returned: 66
TT: 5
Now, let's revisit the most important line in the application and uncomment it.
With -O0 we get the following assembly, generated by gcc:
# else goto bye is UNCOMMENTED
# even gcc -O0 "knows" hi: is unreachable.
fun:
push rbp
mov rbp, rsp
mov eax, 5
pop rbp
ret
.LC0:
.string "TT: %d\n"
main:
push rbp
mov rbp, rsp
sub rsp, 48
mov DWORD PTR [rbp-36], edi
mov QWORD PTR [rbp-48], rsi
mov QWORD PTR [rbp-8], OFFSET FLAT:.L4
mov rax, QWORD PTR [rbp-8]
mov QWORD PTR [rbp-16], rax
mov rax, QWORD PTR [rbp-16]
call rax
mov DWORD PTR [rbp-20], eax
mov eax, DWORD PTR [rbp-20]
mov esi, eax
mov edi, OFFSET FLAT:.LC0
mov eax, 0
call printf
cmp DWORD PTR [rbp-20], 5
nop
.L4:
mov eax, 66
leave
ret
and the following output:
ASM generation compiler returned: 0
Execution build compiler returned: 0
Program returned: 66
So, as you can see, our printf was never called; the culprit is the line mov QWORD PTR [rbp-8], OFFSET FLAT:.L4, where .L4 actually corresponds to our bye label.
And from what I can see from the generated assembly, not a piece of code from the part after hi was added into the generated code.
But at least the application runs and at least has some code for comparing c to 5.
On the other hand, clang with -O0 generates the following nightmare, which by the way crashes:
# else goto bye is UNCOMMENTED
# clang -O0 also doesn't emit any instructions for the hi: block
fun: # @fun
push rbp
mov rbp, rsp
mov eax, 5
pop rbp
ret
main: # @main
push rbp
mov rbp, rsp
sub rsp, 48
mov dword ptr [rbp - 4], 0
mov dword ptr [rbp - 8], edi
mov qword ptr [rbp - 16], rsi
mov qword ptr [rbp - 24], 1
mov rax, qword ptr [rbp - 24]
mov qword ptr [rbp - 32], rax
call qword ptr [rbp - 32]
mov dword ptr [rbp - 36], eax
mov esi, dword ptr [rbp - 36]
movabs rdi, offset .L.str
mov al, 0
call printf
cmp dword ptr [rbp - 36], 5
jne .LBB1_2
jmp .LBB1_3
.LBB1_2:
jmp .LBB1_3
.LBB1_3:
mov eax, 66
add rsp, 48
pop rbp
ret
.L.str:
.asciz "TT: %d\n"
If we turn on some optimization, for example O1, we get from gcc:
# else goto bye is UNCOMMENTED
# gcc -O1
fun:
mov eax, 5
ret
.LC0:
.string "TT: %d\n"
main:
sub rsp, 8
mov eax, OFFSET FLAT:.L3
call rax
mov esi, eax
mov edi, OFFSET FLAT:.LC0
mov eax, 0
call printf
.L3:
mov eax, 66
add rsp, 8
ret
and the application crashes, which is sort of understandable. Again, the compiler has entirely removed our hi section (mov eax, OFFSET FLAT:.L3 goes tiptoe to .L3, which corresponds to our bye section) and unfortunately decided that it's a good idea to increase rsp before a ret, making sure we end up somewhere totally different from where we need to be.
And clang delivers something even more dubious:
# else goto bye is UNCOMMENTED
# clang -O1
fun: # @fun
mov eax, 5
ret
main: # @main
push rax
mov eax, 1
call rax
mov edi, offset .L.str
mov esi, eax
xor eax, eax
call printf
mov eax, 66
pop rcx
ret
.L.str:
.asciz "TT: %d\n"
1 ? How on earth did clang end up with this?
To some degree I understand that the compiler decided that dead code after an if where both the if and the else go to the same location is not needed, but here my knowledge and insight stop.
So now, dear C and C++ gurus, assembly aficionados and compiler crushers, here comes the question:
Why?
Why do you think the compiler decided that the two labels should be considered equivalent once we added the else branch? Why did clang put 1 there? And last but not least: someone with a deep understanding of the C standard could maybe point out where this piece of code deviated so badly from normality that we ended up in this really really weird situation.
"someone with a deep understanding of the C standard could maybe point out where this piece of code deviated so badly from normality that we ended up in this really really weird situation."
You think the ISO C standard has anything to say about this code? It's chock full of UB and GNU extensions, notably pointers to local labels.
Casting a label pointer to a function pointer and calling through it is obviously UB. The GCC manual doesn't say you can do that. It's also UB to goto a label in another function.
You were only able to make that work by tricking the compiler into thinking that block might be reached so it's not removed, then using GNU C Basic asm statements to emit a ret instruction there.
GCC and clang remove dead code even with optimization disabled; e.g. if(0) { ... } doesn't emit any instructions to implement the ...
Also note that the c=5 in hi: compiles with optimization fully disabled (and else goto bye commented) to asm like movl $5, -20(%rbp). i.e. using the caller's RBP to modify local variables in the stack frame of the caller. So you have a nested function.
GNU C allows you to define nested functions that can access the local vars of their parent scope. (If you liked the asm you got from your experiment, you'll love the executable trampoline of machine-code that GCC stores to the stack with mov-immediate if you take a pointer to a nested function!)
asm volatile ("movl $5, %eax"); is missing a clobber on EAX. You step on the compiler's toes which would be UB if this statement was ever reached normally, rather than as if it were a separate function.
The use-case for GNU C Basic asm (no constraints / clobbers) is instructions like cli (disable interrupts), not anything involving integer registers, and definitely not ret.
If you want to define a callable function using inline asm, you can use asm("") at global scope, or as the body of an __attribute__((naked)) function.
I'm reading a file using std::ifstream:
printf("Before stream initialization\n");
ifstream stream(file_path, ios::binary);
printf("Stream initialized\n");
ifstream::pos_type position = stream.tellg();
auto file_size = position;
printf("Position acquired\n");
However, the program crashes in the release build. Here is the compiled assembly code snippet:
.text:0000000000413411 lea rcx, aBeforeStreamIn ; "Before stream initialization\n"
.text:0000000000413418 mov rbx, rax
.text:000000000041341B call _ZL6printfPKcz ; printf(char const*,...)
.text:000000000041341B ; } // starts at 41340C
.text:0000000000413420 lea rdi, [rsp+878h+var_248]
.text:0000000000413428 lea rcx, [rdi+0D8h] ; this
.text:000000000041342F mov [rsp+878h+var_820], rdi
.text:0000000000413434 call _ZNSt8ios_baseC1Ev ; std::ios_base::ios_base(void)
.text:0000000000413439 xor r8d, r8d
.text:000000000041343C mov rax, cs:_refptr__ZTVSt9basic_iosIcSt11char_traitsIcEE
.text:0000000000413443 xor edx, edx
.text:0000000000413445 mov [rsp+878h+var_90], r8w
.text:000000000041344E pxor xmm0, xmm0
.text:0000000000413452 movaps [rsp+878h+var_88], xmm0
.text:000000000041345A movaps [rsp+878h+var_78], xmm0
.text:0000000000413462 mov [rsp+878h+var_98], 0
.text:000000000041346E add rax, 10h
.text:0000000000413472 mov [rsp+878h+var_170], rax
.text:000000000041347A mov rax, cs:_refptr__ZTTSt14basic_ifstreamIcSt11char_traitsIcEE
.text:0000000000413481 mov rsi, [rax+8]
.text:0000000000413485 mov rcx, [rax+10h]
.text:0000000000413489 mov rax, [rsi-18h]
.text:000000000041348D mov [rsp+878h+var_248], rsi
.text:0000000000413495 mov [rsp+878h+var_7E8], rcx
.text:000000000041349D mov [rsp+878h+var_7F0], rsi
.text:00000000004134A5 mov [rsp+rax+878h+var_248], rcx
.text:00000000004134AD mov [rsp+878h+var_240], 0
.text:00000000004134B9 mov rcx, [rsi-18h]
.text:00000000004134BD add rcx, rdi
.text:00000000004134C0 ; try {
.text:00000000004134C0 call _ZNSt9basic_iosIcSt11char_traitsIcEE4initEPSt15basic_streambufIcS1_E ; std::basic_ios<char,std::char_traits<char>>::init(std::basic_streambuf<char,std::char_traits<char>> *)
.text:00000000004134C0 ; } // starts at 4134C0
.text:00000000004134C5 mov rax, cs:_refptr__ZTVSt14basic_ifstreamIcSt11char_traitsIcEE
.text:00000000004134CC lea rcx, [rdi+10h]
.text:00000000004134D0 add rax, 18h
.text:00000000004134D4 mov [rsp+878h+var_248], rax
.text:00000000004134DC mov rax, cs:_refptr__ZTVSt14basic_ifstreamIcSt11char_traitsIcEE
.text:00000000004134E3 add rax, 40h
.text:00000000004134E7 mov [rsp+878h+var_170], rax
.text:00000000004134EF ; try {
.text:00000000004134EF call _ZNSt13basic_filebufIcSt11char_traitsIcEEC1Ev ; std::basic_filebuf<char,std::char_traits<char>>::basic_filebuf(void)
.text:00000000004134EF ; } // starts at 4134EF
.text:00000000004134F4 lea rdx, [rdi+10h]
.text:00000000004134F8 lea rcx, [rdi+0D8h]
.text:00000000004134FF ; try {
.text:00000000004134FF call _ZNSt9basic_iosIcSt11char_traitsIcEE4initEPSt15basic_streambufIcS1_E ; std::basic_ios<char,std::char_traits<char>>::init(std::basic_streambuf<char,std::char_traits<char>> *)
.text:0000000000413504 lea rcx, [rdi+10h]
.text:0000000000413508 mov r8d, 0Eh
.text:000000000041350E mov rdx, rbx
.text:0000000000413511 call _ZNSt13basic_filebufIcSt11char_traitsIcEE4openEPKcSt13_Ios_Openmode ; std::basic_filebuf<char,std::char_traits<char>>::open(char const*,std::_Ios_Openmode)
.text:0000000000413516 mov rdx, [rsp+878h+var_248]
.text:000000000041351E add rdi, [rdx-18h]
.text:0000000000413522 test rax, rax
.text:0000000000413525 mov rcx, rdi
.text:0000000000413528 jz loc_414688
.text:000000000041352E xor edx, edx
.text:0000000000413530 call _ZNSt9basic_iosIcSt11char_traitsIcEE5clearESt12_Ios_Iostate ; std::basic_ios<char,std::char_traits<char>>::clear(std::_Ios_Iostate)
.text:0000000000413530 ; } // starts at 4134FF
.text:0000000000413535
.text:0000000000413535 loc_413535: ; CODE XREF: PointerSearcher::parse_pointer_map(void)+1363↓j
.text:0000000000413535 lea rcx, aStreamInitiali ; "Stream initialized\n"
.text:000000000041353C ; try {
.text:000000000041353C call _ZL6printfPKcz ; printf(char const*,...)
In my function it crashes at this line:
.text:0000000000413504 lea rcx, [rdi+10h]
The output is:
Before stream initialization
Process finished with exit code -1073741819 (0xC0000005)
The stacktrace is:
std::locale::operator=(std::locale const&)
std::ios_base::_M_init()
std::basic_ios<char, std::char_traits<char> >::init(std::basic_streambuf<char, std::char_traits<char> >*)
MyExecutable::myFunction()
The crash only happens in the Windows binary. The binary works in release mode for Linux. I'm using the MinGW compiler to compile the Windows binary and the compilation flags are:
-fopenmp -O3 -DNDEBUG
They're the default CMake release build flags. I also made sure the passed file_path is correct.
gdb says:
Thread 1 received signal SIGSEGV, Segmentation fault.
0x00000000004a2521 in std::locale::operator=(std::locale const&) ()
Thread 1 received signal SIGSEGV, Segmentation fault.
0x00000000004a2521 in std::locale::operator=(std::locale const&) ()
[Thread 48616.0xc508 exited with code 3221225477]
[Thread 48616.0xc510 exited with code 3221225477]
[Thread 48616.0xc638 exited with code 3221225477]
[Inferior 1 (process 48616) exited with code 030000000005]
The compiler version:
"C:\Program Files\mingw-w64\x86_64-8.1.0-win32-seh-rt_v6-rev0\mingw64\bin\x86_64-w64-mingw32-gcc.exe" --version
x86_64-w64-mingw32-gcc.exe (x86_64-win32-seh-rev0, Built by MinGW-W64 project) 8.1.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Does anyone have an idea what went wrong and how to fix it?
This seems to be a MinGW compiler bug, since the crash does not occur when the code is compiled with MSVC in Visual Studio either.
Consider the following code, in C++:
#include <cstdlib>
std::size_t count(std::size_t n)
{
std::size_t i = 0;
while (i < n) {
asm volatile("": : :"memory");
++i;
}
return i;
}
int main(int argc, char* argv[])
{
return count(argc > 1 ? std::atoll(argv[1]) : 1);
}
It is just a loop that increments a counter and returns it at the end. The asm volatile prevents the loop from being optimized away. We compile it with g++ 8.1 and clang++ 5.0 with the arguments -Wall -Wextra -std=c++11 -g -O3.
Now, if we look at what compiler explorer is producing, we have, for g++:
count(unsigned long):
mov rax, rdi
test rdi, rdi
je .L2
xor edx, edx
.L3:
add rdx, 1
cmp rax, rdx
jne .L3
.L2:
ret
main:
mov eax, 1
xor edx, edx
cmp edi, 1
jg .L25
.L21:
add rdx, 1
cmp rdx, rax
jb .L21
mov eax, edx
ret
.L25:
push rcx
mov rdi, QWORD PTR [rsi+8]
mov edx, 10
xor esi, esi
call strtoll
mov rdx, rax
test rax, rax
je .L11
xor edx, edx
.L12:
add rdx, 1
cmp rdx, rax
jb .L12
.L11:
mov eax, edx
pop rdx
ret
and for clang++:
count(unsigned long): # @count(unsigned long)
test rdi, rdi
je .LBB0_1
mov rax, rdi
.LBB0_3: # =>This Inner Loop Header: Depth=1
dec rax
jne .LBB0_3
mov rax, rdi
ret
.LBB0_1:
xor edi, edi
mov rax, rdi
ret
main: # @main
push rbx
cmp edi, 2
jl .LBB1_1
mov rdi, qword ptr [rsi + 8]
xor ebx, ebx
xor esi, esi
mov edx, 10
call strtoll
test rax, rax
jne .LBB1_3
mov eax, ebx
pop rbx
ret
.LBB1_1:
mov eax, 1
.LBB1_3:
mov rcx, rax
.LBB1_4: # =>This Inner Loop Header: Depth=1
dec rcx
jne .LBB1_4
mov rbx, rax
mov eax, ebx
pop rbx
ret
Understanding the code generated by g++ is not that complicated, the loop being:
.L3:
add rdx, 1
cmp rax, rdx
jne .L3
Every iteration increments rdx and compares it to rax, which holds the loop bound n.
Now, I have no idea of what clang++ is doing. Apparently it uses dec, which is weird to me, and I don't even understand where the actual loop is. My question is the following: what is clang doing?
(I am looking for comments about the clang assembly code to describe what is done at each step and how it actually works).
The effect of the function is to return n, either by counting up to n and returning the result, or by simply returning the passed-in value of n. The clang code does the latter. The counting loop is here:
mov rax, rdi
.LBB0_3: # =>This Inner Loop Header: Depth=1
dec rax
jne .LBB0_3
mov rax, rdi
ret
It begins by copying the value of n into rax. It decrements the value in rax, and if the result is not 0, it jumps back to .LBB0_3. If the value is 0 it falls through to the next instruction, which copies the original value of n into rax and returns.
There is no i stored, but the code does the loop the prescribed number of times, and returns the value that i would have had, namely, n.
I am trying to debug a tricky core dump (from an -O2 optimized binary).
// Caller Function
void caller(Container* c)
{
std::list < Message*> msgs;
if(!decoder.called(c->buf_, msgs))
{
....
.....
}
// Called Function
bool
Decoder::called(Buffer* buf, list < Message*>& msgs)
{
add_data(buf); // Inlined code to append buf to decoders buf chain
while(m_data_in && m_data_in->length() > 0)
{
.....
}
}
In both the caller and the callee, the first argument is optimized out, which means it must be held in a register somewhere.
Caller Disassembly:
push %r15
mov %rdi,%r15
push %r14
push %r13
push %r12
push %rbp
push %rbx
sub $0x68,%rsp
test %rsi,%rsi
je 0x8ccd62
cmpq $0x0,(%rsi)
je 0x8ccd62
lea 0x40(%rsp),%rax
lea 0x1b8(%rdi),%rdi
mov %rax,(%rsp)
mov %rax,0x40(%rsp)
mov %rax,%rdx
mov %rax,0x48(%rsp)
mov (%rsi),%rsi
callq 0x8cc820
Caller Register Info:
rax 0x7fbfffc7e0 548682057696
rbx 0x2a97905ba0 182931446688
rcx 0x0 0
rdx 0x2 2
rsi 0x1 1
rdi 0x7fbfffc7e2 548682057698
rbp 0x4f 0x4f
rsp 0x7fbfffc870 0x7fbfffc870
r8 0x40 64
r9 0x20 32
r10 0x7fbfffc7e0 548682057696
r11 0x2abe466600 183580911104
r12 0x7fbfffd910 548682062096 // THIS IS HOLDING buf_
r13 0x7fbfffdec0 548682063552
r14 0x5dc 1500
r15 0x2a97905ba0 182931446688
rip 0x8cca89 0x8cca89
eflags 0x206 [ PF IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
Called function Disassembly:
push %r14
push %r13
mov %rdx,%r13
push %r12
mov %rdi,%r12
push %rbp
push %rbx
sub $0x10,%rsp
mov 0x8(%rdi),%rdx
test %rdx,%rdx
jne 0x8cc843
jmpq 0x8cc9cb
mov %rax,%rdx
mov 0x8(%rdx),%rax
test %rax,%rax
mov %rsi,0x8(%rdx)
mov 0x8(%r12),%rax
test %rax,%rax
xor %edx,%edx
add 0x4(%rax),%edx
mov 0x8(%rax),%rax
lea 0x8(%rsp),%rsi
mov %r12,%rdi
movq $0x0,0x8(%rsp)
Called function Register Info :
rax 0x7fbfffc7e0 548682057696
rbx 0x2abc49f9c0 183547591104
rcx 0x0 0
rdx 0x2 2
rsi 0x1 1
rdi 0x7fbfffc7e2 548682057698
rbp 0xffffffff 0xffffffff
rsp 0x7fbfffc830 0x7fbfffc830
r8 0x40 64
r9 0x20 32
r10 0x7fbfffc7e0 548682057696
r11 0x2abe466600 183580911104
r12 0x2a97905d58 182931447128
r13 0x7fbfffc8b0 548682057904
r14 0x5dc 1500
r15 0x2a97905ba0 182931446688
rip 0x8cc88a 0x8cc88a
eflags 0x206 [ PF IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
The issue is that, in the called function, the "add_data" function appears to have achieved nothing.
So I wanted to know whether, in the disassembly of the called function, we see the "buf_" pointer being used anywhere (register r12 in the caller).
I do understand assembly to some level, but all the code inlining has left me confused.
I would appreciate some help in demystifying the called function's disassembly.
UPDATE:
add_data does the following:
if (m_data_in) {
m_data_in->next = data;
} else {
m_data_in = data;
}
This looks like if (m_data_in)
mov 0x8(%rdi),%rdx
test %rdx,%rdx
jne 0x8cc843
jmpq 0x8cc9cb
Now, I don't quite know where 0x8cc843 and 0x8cc9cb are located in your code, so can't really follow the code further. There is still not enough code & information to say exactly what is going on in the original question. I'm happy to fill in more of this answer if more information is provided.
C++ code
#include <cstdio>
#include <cstdlib>
struct trivialStruct
{
trivialStruct();
~trivialStruct();
int *a;
float *b;
float *c;
};
trivialStruct::trivialStruct() : a((int*)malloc(sizeof(int))), b((float*)malloc(sizeof(float))), c((float*)malloc(sizeof(float)))
{
*a = 100;
*b = 200;
*c = 300;
}
trivialStruct::~trivialStruct()
{
free(a);
free(b);
free(c);
a = nullptr;
b = nullptr;
c = nullptr;
}
int main()
{
trivialStruct A;
printf("%d, %f, %f", *A.a, *A.b, *A.c);
return 0;
}
Assembly
.section __TEXT,__text,regular,pure_instructions
.globl __ZN13trivialStructC1Ev
.align 4, 0x90
__ZN13trivialStructC1Ev: ## @_ZN13trivialStructC1Ev
.cfi_startproc
## BB#0: ## %entry
push RBP
Ltmp3:
.cfi_def_cfa_offset 16
Ltmp4:
.cfi_offset rbp, -16
mov RBP, RSP
Ltmp5:
.cfi_def_cfa_register rbp
push R15
push R14
push RBX
push RAX
Ltmp6:
.cfi_offset rbx, -40
Ltmp7:
.cfi_offset r14, -32
Ltmp8:
.cfi_offset r15, -24
mov RBX, RDI
mov EDI, 4
call _malloc
mov R14, RAX
mov QWORD PTR [RBX], R14
mov EDI, 4
call _malloc
mov R15, RAX
mov QWORD PTR [RBX + 8], R15
mov EDI, 4
call _malloc
mov QWORD PTR [RBX + 16], RAX
mov DWORD PTR [R14], 100
mov DWORD PTR [R15], 1128792064
mov DWORD PTR [RAX], 1133903872
add RSP, 8
pop RBX
pop R14
pop R15
pop RBP
ret
.cfi_endproc
.globl __ZN13trivialStructC2Ev
.align 4, 0x90
__ZN13trivialStructC2Ev: ## @_ZN13trivialStructC2Ev
.cfi_startproc
## BB#0: ## %entry
push RBP
Ltmp12:
.cfi_def_cfa_offset 16
Ltmp13:
.cfi_offset rbp, -16
mov RBP, RSP
Ltmp14:
.cfi_def_cfa_register rbp
push R15
push R14
push RBX
push RAX
Ltmp15:
.cfi_offset rbx, -40
Ltmp16:
.cfi_offset r14, -32
Ltmp17:
.cfi_offset r15, -24
mov RBX, RDI
mov EDI, 4
call _malloc
mov R14, RAX
mov QWORD PTR [RBX], R14
mov EDI, 4
call _malloc
mov R15, RAX
mov QWORD PTR [RBX + 8], R15
mov EDI, 4
call _malloc
mov QWORD PTR [RBX + 16], RAX
mov DWORD PTR [R14], 100
mov DWORD PTR [R15], 1128792064
mov DWORD PTR [RAX], 1133903872
add RSP, 8
pop RBX
pop R14
pop R15
pop RBP
ret
.cfi_endproc
.globl __ZN13trivialStructD1Ev
.align 4, 0x90
__ZN13trivialStructD1Ev: ## @_ZN13trivialStructD1Ev
.cfi_startproc
## BB#0: ## %entry
push RBP
Ltmp21:
.cfi_def_cfa_offset 16
Ltmp22:
.cfi_offset rbp, -16
mov RBP, RSP
Ltmp23:
.cfi_def_cfa_register rbp
push RBX
push RAX
Ltmp24:
.cfi_offset rbx, -24
mov RBX, RDI
mov RDI, QWORD PTR [RBX]
call _free
mov RDI, QWORD PTR [RBX + 8]
call _free
mov RDI, QWORD PTR [RBX + 16]
call _free
mov QWORD PTR [RBX + 16], 0
mov QWORD PTR [RBX + 8], 0
mov QWORD PTR [RBX], 0
add RSP, 8
pop RBX
pop RBP
ret
.cfi_endproc
.globl __ZN13trivialStructD2Ev
.align 4, 0x90
__ZN13trivialStructD2Ev: ## @_ZN13trivialStructD2Ev
.cfi_startproc
## BB#0: ## %entry
push RBP
Ltmp28:
.cfi_def_cfa_offset 16
Ltmp29:
.cfi_offset rbp, -16
mov RBP, RSP
Ltmp30:
.cfi_def_cfa_register rbp
push RBX
push RAX
Ltmp31:
.cfi_offset rbx, -24
mov RBX, RDI
mov RDI, QWORD PTR [RBX]
call _free
mov RDI, QWORD PTR [RBX + 8]
call _free
mov RDI, QWORD PTR [RBX + 16]
call _free
mov QWORD PTR [RBX + 16], 0
mov QWORD PTR [RBX + 8], 0
mov QWORD PTR [RBX], 0
add RSP, 8
pop RBX
pop RBP
ret
.cfi_endproc
.section __TEXT,__literal8,8byte_literals
.align 3
LCPI4_0:
.quad 4641240890982006784 ## double 200
LCPI4_1:
.quad 4643985272004935680 ## double 300
.section __TEXT,__text,regular,pure_instructions
.globl _main
.align 4, 0x90
_main: ## @main
.cfi_startproc
## BB#0: ## %entry
push RBP
Ltmp34:
.cfi_def_cfa_offset 16
Ltmp35:
.cfi_offset rbp, -16
mov RBP, RSP
Ltmp36:
.cfi_def_cfa_register rbp
lea RDI, QWORD PTR [RIP + L_.str]
movsd XMM0, QWORD PTR [RIP + LCPI4_0]
movsd XMM1, QWORD PTR [RIP + LCPI4_1]
mov ESI, 100
mov AL, 2
call _printf
xor EAX, EAX
pop RBP
ret
.cfi_endproc
.section __TEXT,__cstring,cstring_literals
L_.str: ## @.str
.asciz "%d, %f, %f"
.subsections_via_symbols
Command
clang++ -S -O2 -std=c++11 -mllvm --x86-asm-syntax=intel -fno-exceptions main.cpp
As you can see, two pairs of functions are identical (the two constructors and the two destructors):
__ZN13trivialStructC1Ev: ## @_ZN13trivialStructC1Ev
__ZN13trivialStructC2Ev: ## @_ZN13trivialStructC2Ev
__ZN13trivialStructD1Ev: ## @_ZN13trivialStructD1Ev
__ZN13trivialStructD2Ev: ## @_ZN13trivialStructD2Ev
I have no idea why the compiler generates two copies of the code instead of just one.
I am not familiar with assembly, but it looks like this just makes the code fatter (and maybe slower).
This is part of the ABI for your platform, and falls outside the standard. Both constructors and destructors can generate multiple symbols in the binary. For example, the Itanium C++ ABI defines up to three variants of each constructor and destructor:
complete object constructor (mangled as C1)
base object constructor (C2)
complete object allocating constructor (C3)
deleting destructor (D0)
complete object destructor (D1)
base object destructor (D2)
The different symbols take on slightly different responsibilities as the implementation might need to do different things depending on how the object is being created/destroyed. In your particular case, the code is simple enough that all constructors might generate exactly the same code, but they need to be there to comply with the ABI, and the ABI has them to enable more complex use cases.
For example, a complete object constructor will initialize virtual bases, while the base object constructor will skip this part of construction. If there is multiple/virtual inheritance and virtual functions, the vptr in the complete object may have to jump through different sets of intermediate tables depending on how this subobject is being instantiated.
If you want an explanation beyond "the ABI mandates it", you should take a look at the documentation for your particular ABI. You can also take a look at Inside the C++ Object Model, which, even if old, contains a good description of the problems to solve and some of the solutions provided.