I am trying to debug a tricky core dump (from an -O2 optimized binary).
// Caller Function
void caller(Container* c)
{
std::list < Message*> msgs;
if(!decoder.called(c->buf_, msgs))
{
....
.....
}
// Called Function
bool
Decoder::called(Buffer* buf, list < Message*>& msgs)
{
add_data(buf); // Inlined code to append buf to decoders buf chain
while(m_data_in && m_data_in->length() > 0)
{
.....
}
}
In both the caller and the callee, the first argument is reported as optimized out, which means it must be held in a register somewhere.
Caller Disassembly:
push %r15
mov %rdi,%r15
push %r14
push %r13
push %r12
push %rbp
push %rbx
sub $0x68,%rsp
test %rsi,%rsi
je 0x8ccd62
cmpq $0x0,(%rsi)
je 0x8ccd62
lea 0x40(%rsp),%rax
lea 0x1b8(%rdi),%rdi
mov %rax,(%rsp)
mov %rax,0x40(%rsp)
mov %rax,%rdx
mov %rax,0x48(%rsp)
mov (%rsi),%rsi
callq 0x8cc820
Caller Register Info:
rax 0x7fbfffc7e0 548682057696
rbx 0x2a97905ba0 182931446688
rcx 0x0 0
rdx 0x2 2
rsi 0x1 1
rdi 0x7fbfffc7e2 548682057698
rbp 0x4f 0x4f
rsp 0x7fbfffc870 0x7fbfffc870
r8 0x40 64
r9 0x20 32
r10 0x7fbfffc7e0 548682057696
r11 0x2abe466600 183580911104
r12 0x7fbfffd910 548682062096 // THIS IS HOLDING buf_
r13 0x7fbfffdec0 548682063552
r14 0x5dc 1500
r15 0x2a97905ba0 182931446688
rip 0x8cca89 0x8cca89
eflags 0x206 [ PF IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
Called function Disassembly:
push %r14
push %r13
mov %rdx,%r13
push %r12
mov %rdi,%r12
push %rbp
push %rbx
sub $0x10,%rsp
mov 0x8(%rdi),%rdx
test %rdx,%rdx
jne 0x8cc843
jmpq 0x8cc9cb
mov %rax,%rdx
mov 0x8(%rdx),%rax
test %rax,%rax
mov %rsi,0x8(%rdx)
mov 0x8(%r12),%rax
test %rax,%rax
xor %edx,%edx
add 0x4(%rax),%edx
mov 0x8(%rax),%rax
lea 0x8(%rsp),%rsi
mov %r12,%rdi
movq $0x0,0x8(%rsp)
Called function Register Info :
rax 0x7fbfffc7e0 548682057696
rbx 0x2abc49f9c0 183547591104
rcx 0x0 0
rdx 0x2 2
rsi 0x1 1
rdi 0x7fbfffc7e2 548682057698
rbp 0xffffffff 0xffffffff
rsp 0x7fbfffc830 0x7fbfffc830
r8 0x40 64
r9 0x20 32
r10 0x7fbfffc7e0 548682057696
r11 0x2abe466600 183580911104
r12 0x2a97905d58 182931447128
r13 0x7fbfffc8b0 548682057904
r14 0x5dc 1500
r15 0x2a97905ba0 182931446688
rip 0x8cc88a 0x8cc88a
eflags 0x206 [ PF IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
The issue is that, in the called function, the "add_data" function appears to have achieved nothing.
So I wanted to know whether the disassembly of the called function shows the "buf_" pointer being used anywhere (register r12 in the callee).
I do understand assembly to some level, but all the code inlining has left me confused.
I would appreciate some help in demystifying the called function's disassembly.
UPDATE:
add_data does below:
if (m_data_in) {
m_data_in->next = data;
} else {
m_data_in = data;
}
This looks like if (m_data_in)
mov 0x8(%rdi),%rdx
test %rdx,%rdx
jne 0x8cc843
jmpq 0x8cc9cb
Now, I don't quite know where 0x8cc843 and 0x8cc9cb are located in your code, so can't really follow the code further. There is still not enough code & information to say exactly what is going on in the original question. I'm happy to fill in more of this answer if more information is provided.
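For reference while reading the listing, here is a minimal C++ sketch of what a tail append to a buffer chain looks like. It is a guess at the shape of the inlined code, not the actual Decoder: the Chain type, the field layout and total_length() are hypothetical stand-ins. It does match the visible pattern, where the next pointer is loaded from offset 0x8 (mov 0x8(%rdx),%rax), tested, and the incoming pointer, still in %rsi, is finally stored with mov %rsi,0x8(%rdx). The branch instructions of that loop are missing from the listing, so treat this as an assumption.

```cpp
// Hypothetical stand-in for the question's Buffer type.
struct Buffer {
    int length = 0;          // placeholder for the payload size
    Buffer* next = nullptr;  // next buffer in the chain
};

// Hypothetical stand-in for the part of Decoder that owns the chain.
struct Chain {
    Buffer* m_data_in = nullptr;  // head of the chain

    // Append data at the tail of the chain.
    void add_data(Buffer* data) {
        if (!m_data_in) {        // empty chain: data becomes the head
            m_data_in = data;
            return;
        }
        Buffer* tail = m_data_in;
        while (tail->next) {     // walk to the last node
            tail = tail->next;
        }
        tail->next = data;       // link the new buffer after the tail
    }

    // Sum of all buffer lengths, handy for checking that an append took effect.
    int total_length() const {
        int total = 0;
        for (const Buffer* b = m_data_in; b; b = b->next) {
            total += b->length;
        }
        return total;
    }
};
```

If the real code has this shape, the value to inspect in the core is the chain reachable from 0x8(%r12), since %r12 holds the this pointer copied from %rdi in the callee.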
Any idea why code that looks like this
list<Foo> fooList;
processList(&fooList);
Generates the following machine code
lea rax, [rbp-48]
mov rdi, rax
call processList(std::__cxx11::list<Foo, std::allocator<Foo> >*)
lea rax, [rbp-48]
mov rdi, rax
call std::__cxx11::list<Foo, std::allocator<Foo> >::~list()
jmp .L11
mov rbx, rax
lea rax, [rbp-48]
mov rdi, rax
call std::__cxx11::list<Foo, std::allocator<Foo> >::~list()
mov rax, rbx
mov rdi, rax
call _Unwind_Resume
.L11:
add rsp, 40
pop rbx
pop rbp
ret
In particular, I don't see any paths leading to the line after the unconditional jmp .L11
(this is with GCC 6.2 with no optimization, generated on compiler explorer)
For comparison, clang 5.0.0 produces
call processList(std::__cxx11::list<Foo, std::allocator<Foo> >*)
jmp .LBB5_1
.LBB5_1:
lea rdi, [rbp - 24]
call std::__cxx11::list<Foo, std::allocator<Foo> >::~list()
add rsp, 48
pop rbp
ret
lea rdi, [rbp - 24]
mov ecx, edx
mov qword ptr [rbp - 32], rax
mov dword ptr [rbp - 36], ecx
call std::__cxx11::list<Foo, std::allocator<Foo> >::~list()
mov rdi, qword ptr [rbp - 32]
call _Unwind_Resume
Again there is an unconditional jump to a return block, and an unwind block (starting with the second lea rdi) that seems unreachable.
After a bit of research on C++ exception mechanisms, my conclusion is that the process is as follows:
At the point of exception throw, __cxa_throw gets called. This is somewhat like longjmp() in that the function gets called but never returns. The function performs two main tasks
It walks up the call stack looking for a catch. If it doesn't find any, std::terminate gets called.
If it does find a catch block then it calls all of the unwind handlers between the current function and the catch block, then calls the catch block.
Back to my original machine code (with filtering turned off in compiler explorer). My comments after the hashes.
# this is the normative path
call std::list<Handle, std::allocator<Handle> >::~list()
# unconditional jump around the unwind handler
jmp .L11
.L10:
# unwind handler code, calls the local variable destructor
mov rbx, rax
.loc 2 30 0
lea rax, [rbp-32]
mov rdi, rax
call std::list<Handle, std::allocator<Foo> >::~list()
mov rax, rbx
mov rdi, rax
.LEHB1:
# carry on unwinding
call _Unwind_Resume
.L11:
Then there is the exception table
.section .gcc_except_table,"a",@progbits
.LLSDA1386:
.byte 0xff
.byte 0xff
.byte 0x1
.uleb128 .LLSDACSE1386-.LLSDACSB1386
.LLSDACSB1386:
# entry for unwind handler
.uleb128 .LEHB0-.LFB1386
.uleb128 .LEHE0-.LEHB0
.uleb128 .L10-.LFB1386
.uleb128 0
.uleb128 .LEHB1-.LFB1386
.uleb128 .LEHE1-.LEHB1
.uleb128 0
.uleb128 0
I guess that the unwind handler function can work out the positions of the unwind handler blocks from the addresses on the stack and the offsets in this table.
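The effect of that "unreachable" block can be observed from the C++ side. Below is a minimal sketch (Tracer, thrower, middle and run_unwind_demo are names made up for this illustration) in which a function with a local object and no try/catch sits between the throw and the catch. Its destructor still runs, which is the job of the unwind handler block that no ordinary jump leads to.

```cpp
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Records the order of events so the cleanup path can be observed.
static std::vector<std::string> g_events;

struct Tracer {
    std::string name;
    explicit Tracer(std::string n) : name(std::move(n)) {}
    ~Tracer() { g_events.push_back("~" + name); }
};

void thrower() {
    throw std::runtime_error("boom");
}

void middle() {
    Tracer local("middle");
    thrower();  // no try/catch here: the destructor of local runs during unwinding
    g_events.push_back("not reached");
}

std::vector<std::string> run_unwind_demo() {
    g_events.clear();
    try {
        Tracer outer("outer");
        middle();
    } catch (const std::exception& e) {
        g_events.push_back(std::string("caught ") + e.what());
    }
    return g_events;
}
```

Calling run_unwind_demo() records the destructor of the local in middle() first, then the one in the try block, and only then the catch handler. The "not reached" entry never appears.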
I'm new to PIN, so perhaps there is an easy explanation for this. I'm puzzled by the routine addresses PIN returns, on Windows only. I created a minimal test program to illustrate my point.
I'm currently using PIN 2.14. My inspected application is a Debug build, with disabled ASLR on Windows.
First consider this simple application that calls an empty method and prints the method's address:
#include "stdio.h"
class Test {
public:
void method() {}
};
void main() {
Test t;
t.method();
printf("Test::method = 0x%p\n", &Test::method);
}
The following pin tool will disassemble a program's main routine and print the address of Test::method:
#include <pin.H>
#include "stdio.h"
VOID disassemble(IMG img, VOID *v)
{
for (SEC sec = IMG_SecHead(img); SEC_Valid(sec); sec = SEC_Next(sec))
{
for (RTN rtn = SEC_RtnHead(sec); RTN_Valid(rtn); rtn = RTN_Next(rtn))
{
auto name = RTN_Name(rtn);
if(name == "Test::method" || name == "_ZN4Test6methodEv") {
printf("%s detected by PIN resides at 0x%p.\n", name.c_str(), RTN_Address(rtn));
}
if (RTN_Name(rtn) != "main") continue;
RTN_Open(rtn);
for (INS ins = RTN_InsHead(rtn); INS_Valid(ins); ins = INS_Next(ins)) {
printf("%s\n", INS_Disassemble(ins).c_str());
}
RTN_Close(rtn);
}
}
}
int main(int argc, char **argv)
{
PIN_InitSymbols();
PIN_Init(argc, argv);
IMG_AddInstrumentFunction(disassemble, 0);
PIN_StartProgram();
return 0;
}
Running this PIN tool against the test application on an Ubuntu 14.04.3 x64 VM prints:
push rbp
mov rbp, rsp
push r13
push r12
push rbx
sub rsp, 0x18
lea rax, ptr [rbp-0x21]
mov rdi, rax
call 0x400820
mov r12d, 0x400820
mov r13d, 0x0
mov rcx, r12
mov rbx, r13
mov rax, r12
mov rdx, r13
mov rax, rdx
mov rsi, rcx
mov rdx, rax
mov edi, 0x4008b4
mov eax, 0x0
call 0x4006a0
mov eax, 0x0
add rsp, 0x18
pop rbx
pop r12
pop r13
pop rbp
ret
_ZN4Test6methodEv detected by PIN resides at 0x0x400820.
Test::method = 0x0x400820
Please note that both the call targeting t.method(), as well as the function address retrieved by PIN and the application's output show Test::method to reside at address 0x0x400820.
The output on my Windows 10 x64 machine is:
push rdi
sub rsp, 0x40
mov rdi, rsp
mov ecx, 0x10
mov eax, 0xcccccccc
rep stosd dword ptr [rdi]
lea rcx, ptr [rsp+0x24]
call 0x14000104b
lea rdx, ptr [rip-0x2b]
lea rcx, ptr [rip+0x74ca3]
call 0x1400013f0
xor eax, eax
mov edi, eax
mov rcx, rsp
lea rdx, ptr [rip+0x74cf0]
call 0x1400015f0
mov eax, edi
add rsp, 0x40
pop rdi
ret
Test::method detected by PIN resides at 0x0000000140001120.
Test::method = 0x000000014000104B
The application's output and the call target in the disassembly show the same value. However the routine address returned by PIN is different!
I'm very puzzled about this behavior. Do you have any idea how to explain this?
Thanks for any suggestions!
Answering my question for myself:
After probing for a while it became evident that both function pointers, the call target as well as the routine address returned by PIN were valid and ended up calling the method.
I ended up looking at the in-memory disassembly at my call target's address, and sure enough it looked something like this:
0000000140001122 E9 D2 00 00 00 jmp Test::method (014000104Bh)
In other words: PIN seemingly replaced the original method with a jump to its own version of the code. Knowing this, correlating original function pointers and PIN routine addresses again became possible which is all I needed to do.
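For anyone who wants to automate that correlation, here is a small sketch of how the destination of such a 5-byte jmp (opcode E9 followed by a 32-bit little-endian displacement) can be computed. The helper name jmp_rel32_target is made up for this example, and it assumes the stub really is a plain jmp rel32, which is what the bytes above suggest.

```cpp
#include <cstdint>
#include <cstring>

// Returns the destination of a 5-byte `jmp rel32` (opcode 0xE9) located at
// `address`, or 0 if the bytes do not start with that opcode.
inline std::uint64_t jmp_rel32_target(std::uint64_t address,
                                      const unsigned char* bytes) {
    if (bytes[0] != 0xE9) {
        return 0;
    }
    // Assemble the little-endian 32-bit displacement.
    std::uint32_t raw = static_cast<std::uint32_t>(bytes[1])
                      | (static_cast<std::uint32_t>(bytes[2]) << 8)
                      | (static_cast<std::uint32_t>(bytes[3]) << 16)
                      | (static_cast<std::uint32_t>(bytes[4]) << 24);
    std::int32_t rel;
    std::memcpy(&rel, &raw, sizeof rel);  // reinterpret as signed
    // The displacement is relative to the end of the 5-byte instruction.
    return address + 5 + static_cast<std::uint64_t>(static_cast<std::int64_t>(rel));
}
```

Feeding it the address of the stub and the five bytes read from memory gives the address to compare against RTN_Address.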
I'm trying to write a "hello world" program to test inline assembler in g++.
(still learning AT&T syntax)
The code is:
#include <stdlib.h>
#include <stdio.h>
# include <iostream>
using namespace std;
int main() {
int c,d;
__asm__ __volatile__ (
"mov %eax,1; \n\t"
"cpuid; \n\t"
"mov %edx, $d; \n\t"
"mov %ecx, $c; \n\t"
);
cout << c << " " << d << "\n";
return 0;
}
I'm getting the following error:
inline1.cpp: Assembler messages:
inline1.cpp:18: Error: unsupported instruction `mov'
inline1.cpp:19: Error: unsupported instruction `mov'
Can you help me to get it done?
Tks
Your assembly code is not valid. Please read up carefully on Extended Asm. Here's another good overview.
Here is a CPUID code example from here:
static inline void cpuid(int code, uint32_t* a, uint32_t* d)
{
asm volatile ( "cpuid" : "=a"(*a), "=d"(*d) : "0"(code) : "ebx", "ecx" );
}
Note the format:
first : followed by output operands: : "=a"(*a), "=d"(*d); "=a" is eax and "=d" is edx
second : followed by input operands: : "0"(code); "0" means that code should occupy the same location as output operand 0 (eax in this case)
third : followed by clobbered registers list: : "ebx", "ecx"
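Applying the same three-part format to what the question is after (leaf 1, results in ecx and edx), a variant that returns all four registers could look like the sketch below. CpuidRegs and cpuid_leaf are names invented for this example. The preprocessor guard is my own addition so that the snippet still compiles on non-x86 targets, where it simply reports failure. Because all four registers are listed as outputs, no clobber list is needed.

```cpp
#include <cstdint>

// Illustrative container for the four registers CPUID writes.
struct CpuidRegs {
    std::uint32_t eax, ebx, ecx, edx;
};

// Runs CPUID for the given leaf/subleaf. Returns false when not built for
// x86 with a GCC-compatible compiler (an assumption of this sketch).
inline bool cpuid_leaf(std::uint32_t leaf, std::uint32_t subleaf, CpuidRegs& out) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    asm volatile("cpuid"
                 // outputs: one operand per register CPUID modifies
                 : "=a"(out.eax), "=b"(out.ebx), "=c"(out.ecx), "=d"(out.edx)
                 // inputs: leaf goes in eax, subleaf in ecx
                 : "a"(leaf), "c"(subleaf));
    return true;
#else
    (void)leaf;
    (void)subleaf;
    (void)out;
    return false;
#endif
}
```

Calling cpuid_leaf(1, 0, regs) then leaves the feature bits in regs.ecx and regs.edx. On 32-bit position-independent builds with older GCC versions, ebx is reserved and using "=b" may be rejected, so this is best tried on x86-64 first.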
I kept @AMA's answer as the accepted one because it was complete enough. But I've put some thought into it and concluded that it is not 100% correct.
The code I was trying to implement in GCC is the one below (Microsoft Visual Studio version).
int c,d;
_asm
{
mov eax, 1;
cpuid;
mov d, edx;
mov c, ecx;
}
When cpuid executes with eax set to 1, feature information is returned in ecx and edx.
The suggested code returns the values from eax ("=a") and edx ("=d").
This can be easily seen at gdb:
(gdb) disassemble cpuid
Dump of assembler code for function cpuid(int, uint32_t*, uint32_t*):
0x0000000000000a2a <+0>: push %rbp
0x0000000000000a2b <+1>: mov %rsp,%rbp
0x0000000000000a2e <+4>: push %rbx
0x0000000000000a2f <+5>: mov %edi,-0xc(%rbp)
0x0000000000000a32 <+8>: mov %rsi,-0x18(%rbp)
0x0000000000000a36 <+12>: mov %rdx,-0x20(%rbp)
0x0000000000000a3a <+16>: mov -0xc(%rbp),%eax
0x0000000000000a3d <+19>: cpuid
0x0000000000000a3f <+21>: mov -0x18(%rbp),%rcx
0x0000000000000a43 <+25>: mov %eax,(%rcx) <== HERE
0x0000000000000a45 <+27>: mov -0x20(%rbp),%rax
0x0000000000000a49 <+31>: mov %edx,(%rax) <== HERE
0x0000000000000a4b <+33>: nop
0x0000000000000a4c <+34>: pop %rbx
0x0000000000000a4d <+35>: pop %rbp
0x0000000000000a4e <+36>: retq
End of assembler dump.
The code that generates something closer to what I want is (EDITED based on feedback in the comments):
static inline void cpuid2(uint32_t* d, uint32_t* c)
{
int a = 1;
asm volatile ( "cpuid" : "=d"(*d), "=c"(*c), "+a"(a) :: "ebx" );
}
The result is:
(gdb) disassemble cpuid2
Dump of assembler code for function cpuid2(uint32_t*, uint32_t*):
0x00000000000009b0 <+0>: push %rbp
0x00000000000009b1 <+1>: mov %rsp,%rbp
0x00000000000009b4 <+4>: push %rbx
0x00000000000009b5 <+5>: mov %rdi,-0x20(%rbp)
0x00000000000009b9 <+9>: mov %rsi,-0x28(%rbp)
0x00000000000009bd <+13>: movl $0x1,-0xc(%rbp)
0x00000000000009c4 <+20>: mov -0xc(%rbp),%eax
0x00000000000009c7 <+23>: cpuid
0x00000000000009c9 <+25>: mov %edx,%esi
0x00000000000009cb <+27>: mov -0x20(%rbp),%rdx
0x00000000000009cf <+31>: mov %esi,(%rdx)
0x00000000000009d1 <+33>: mov -0x28(%rbp),%rdx
0x00000000000009d5 <+37>: mov %ecx,(%rdx)
0x00000000000009d7 <+39>: mov %eax,-0xc(%rbp)
0x00000000000009da <+42>: nop
0x00000000000009db <+43>: pop %rbx
0x00000000000009dc <+44>: pop %rbp
0x00000000000009dd <+45>: retq
End of assembler dump.
Just to be clear... I know that there are better ways of doing it. But the purpose here is purely educational. Just want to understand how it works ;-)
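For completeness, one of those better ways: GCC and Clang ship a <cpuid.h> header whose __get_cpuid helper wraps the instruction. The sketch below reads leaf 1 through it. Leaf1Features, read_leaf1 and has_sse2 are names I made up, and the guard around the include is only there so the snippet compiles on non-x86 targets.

```cpp
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define SKETCH_HAVE_CPUID_H 1
#else
#define SKETCH_HAVE_CPUID_H 0
#endif

// Illustrative result type: the two feature words of CPUID leaf 1.
struct Leaf1Features {
    bool valid;    // false if the leaf could not be read
    unsigned ecx;  // feature flags returned in ecx
    unsigned edx;  // feature flags returned in edx
};

inline Leaf1Features read_leaf1() {
    Leaf1Features f = {false, 0u, 0u};
#if SKETCH_HAVE_CPUID_H
    unsigned eax = 0, ebx = 0;
    // __get_cpuid returns 0 when the requested leaf is not supported.
    f.valid = __get_cpuid(1, &eax, &ebx, &f.ecx, &f.edx) != 0;
#endif
    return f;
}

// SSE2 support is reported in bit 26 of edx for leaf 1.
inline bool has_sse2(const Leaf1Features& f) {
    return f.valid && ((f.edx >> 26) & 1u) != 0;
}
```

This gives the same ecx and edx values as the hand-written version, without having to get the constraints right yourself.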
I've set gdb's disassembly-flavor to Intel (both as su and as a normal user), but it's still showing the assembly code in AT&T notation:
patrick@localhost:~/Dokumente/Projekte$ gdb -q ./a.out
Reading symbols from ./a.out...done.
(gdb) break main
Breakpoint 1 at 0x40050e: file firstprog.c, line 5.
(gdb) run
Starting program: /home/patrick/Dokumente/Projekte/a.out
Breakpoint 1, main () at firstprog.c:5
5 for(i=0; i < 10; i++)
(gdb) show disassembly
The disassembly flavor is "intel".
(gdb) info registers
rax 0x400506 4195590
rbx 0x0 0
rcx 0x0 0
rdx 0x7fffffffe2d8 140737488347864
rsi 0x7fffffffe2c8 140737488347848
rdi 0x1 1
rbp 0x7fffffffe1e0 0x7fffffffe1e0
(gdb) info register eip
Invalid register `eip'
I did restart the computer. My OS is Kali Linux amd64.
I have the following questions:
Why is gdb still showing the AT&T notation?
Why is the register EIP (instruction pointer) shown as invalid register?
You are misunderstanding what disassembly flavour means. It means exactly that: what the disassembly looks like when you view machine code in a human-readable(ish) form.
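To print a register, or use one in an expression such as print or x, write it as $reg, for example $rip, $pc, or $eax. Note that on amd64 the instruction pointer is rip rather than eip, which is why info register eip is rejected; info registers rip works.
If I disassemble one of my programs with AT&T syntax, gdb shows this: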
0x00000000007378f0 <+0>: push %rbp
0x00000000007378f1 <+1>: mov %rsp,%rbp
0x00000000007378f4 <+4>: sub $0x20,%rsp
0x00000000007378f8 <+8>: movl $0x0,-0x4(%rbp)
0x00000000007378ff <+15>: mov %edi,-0x8(%rbp)
0x0000000000737902 <+18>: mov %rsi,-0x10(%rbp)
=> 0x0000000000737906 <+22>: mov -0x10(%rbp),%rsi
0x000000000073790a <+26>: mov (%rsi),%rdi
0x000000000073790d <+29>: callq 0x737950 <FindLibPath(char const*)>
0x0000000000737912 <+34>: xor %eax,%eax
Then do this:
(gdb) set disassembly-flavor intel
(gdb) disass main
Dump of assembler code for function main(int, char**):
0x00000000007378f0 <+0>: push rbp
0x00000000007378f1 <+1>: mov rbp,rsp
0x00000000007378f4 <+4>: sub rsp,0x20
0x00000000007378f8 <+8>: mov DWORD PTR [rbp-0x4],0x0
0x00000000007378ff <+15>: mov DWORD PTR [rbp-0x8],edi
0x0000000000737902 <+18>: mov QWORD PTR [rbp-0x10],rsi
=> 0x0000000000737906 <+22>: mov rsi,QWORD PTR [rbp-0x10]
0x000000000073790a <+26>: mov rdi,QWORD PTR [rsi]
0x000000000073790d <+29>: call 0x737950 <FindLibPath(char const*)>
0x0000000000737912 <+34>: xor eax,eax
and you can see the difference. The register names, and the way you refer to registers on the gdb command line, do not change: you need $reg in both cases.
C++ code
#include <cstdio>
#include <cstdlib>
struct trivialStruct
{
trivialStruct();
~trivialStruct();
int *a;
float *b;
float *c;
};
trivialStruct::trivialStruct() : a((int*)malloc(sizeof(int))), b((float*)malloc(sizeof(float))), c((float*)malloc(sizeof(float)))
{
*a = 100;
*b = 200;
*c = 300;
}
trivialStruct::~trivialStruct()
{
free(a);
free(b);
free(c);
a = nullptr;
b = nullptr;
c = nullptr;
}
int main()
{
trivialStruct A;
printf("%d, %f, %f", *A.a, *A.b, *A.c);
return 0;
}
assembly
.section __TEXT,__text,regular,pure_instructions
.globl __ZN13trivialStructC1Ev
.align 4, 0x90
__ZN13trivialStructC1Ev: ## @_ZN13trivialStructC1Ev
.cfi_startproc
## BB#0: ## %entry
push RBP
Ltmp3:
.cfi_def_cfa_offset 16
Ltmp4:
.cfi_offset rbp, -16
mov RBP, RSP
Ltmp5:
.cfi_def_cfa_register rbp
push R15
push R14
push RBX
push RAX
Ltmp6:
.cfi_offset rbx, -40
Ltmp7:
.cfi_offset r14, -32
Ltmp8:
.cfi_offset r15, -24
mov RBX, RDI
mov EDI, 4
call _malloc
mov R14, RAX
mov QWORD PTR [RBX], R14
mov EDI, 4
call _malloc
mov R15, RAX
mov QWORD PTR [RBX + 8], R15
mov EDI, 4
call _malloc
mov QWORD PTR [RBX + 16], RAX
mov DWORD PTR [R14], 100
mov DWORD PTR [R15], 1128792064
mov DWORD PTR [RAX], 1133903872
add RSP, 8
pop RBX
pop R14
pop R15
pop RBP
ret
.cfi_endproc
.globl __ZN13trivialStructC2Ev
.align 4, 0x90
__ZN13trivialStructC2Ev: ## @_ZN13trivialStructC2Ev
.cfi_startproc
## BB#0: ## %entry
push RBP
Ltmp12:
.cfi_def_cfa_offset 16
Ltmp13:
.cfi_offset rbp, -16
mov RBP, RSP
Ltmp14:
.cfi_def_cfa_register rbp
push R15
push R14
push RBX
push RAX
Ltmp15:
.cfi_offset rbx, -40
Ltmp16:
.cfi_offset r14, -32
Ltmp17:
.cfi_offset r15, -24
mov RBX, RDI
mov EDI, 4
call _malloc
mov R14, RAX
mov QWORD PTR [RBX], R14
mov EDI, 4
call _malloc
mov R15, RAX
mov QWORD PTR [RBX + 8], R15
mov EDI, 4
call _malloc
mov QWORD PTR [RBX + 16], RAX
mov DWORD PTR [R14], 100
mov DWORD PTR [R15], 1128792064
mov DWORD PTR [RAX], 1133903872
add RSP, 8
pop RBX
pop R14
pop R15
pop RBP
ret
.cfi_endproc
.globl __ZN13trivialStructD1Ev
.align 4, 0x90
__ZN13trivialStructD1Ev: ## @_ZN13trivialStructD1Ev
.cfi_startproc
## BB#0: ## %entry
push RBP
Ltmp21:
.cfi_def_cfa_offset 16
Ltmp22:
.cfi_offset rbp, -16
mov RBP, RSP
Ltmp23:
.cfi_def_cfa_register rbp
push RBX
push RAX
Ltmp24:
.cfi_offset rbx, -24
mov RBX, RDI
mov RDI, QWORD PTR [RBX]
call _free
mov RDI, QWORD PTR [RBX + 8]
call _free
mov RDI, QWORD PTR [RBX + 16]
call _free
mov QWORD PTR [RBX + 16], 0
mov QWORD PTR [RBX + 8], 0
mov QWORD PTR [RBX], 0
add RSP, 8
pop RBX
pop RBP
ret
.cfi_endproc
.globl __ZN13trivialStructD2Ev
.align 4, 0x90
__ZN13trivialStructD2Ev: ## @_ZN13trivialStructD2Ev
.cfi_startproc
## BB#0: ## %entry
push RBP
Ltmp28:
.cfi_def_cfa_offset 16
Ltmp29:
.cfi_offset rbp, -16
mov RBP, RSP
Ltmp30:
.cfi_def_cfa_register rbp
push RBX
push RAX
Ltmp31:
.cfi_offset rbx, -24
mov RBX, RDI
mov RDI, QWORD PTR [RBX]
call _free
mov RDI, QWORD PTR [RBX + 8]
call _free
mov RDI, QWORD PTR [RBX + 16]
call _free
mov QWORD PTR [RBX + 16], 0
mov QWORD PTR [RBX + 8], 0
mov QWORD PTR [RBX], 0
add RSP, 8
pop RBX
pop RBP
ret
.cfi_endproc
.section __TEXT,__literal8,8byte_literals
.align 3
LCPI4_0:
.quad 4641240890982006784 ## double 200
LCPI4_1:
.quad 4643985272004935680 ## double 300
.section __TEXT,__text,regular,pure_instructions
.globl _main
.align 4, 0x90
_main: ## @main
.cfi_startproc
## BB#0: ## %entry
push RBP
Ltmp34:
.cfi_def_cfa_offset 16
Ltmp35:
.cfi_offset rbp, -16
mov RBP, RSP
Ltmp36:
.cfi_def_cfa_register rbp
lea RDI, QWORD PTR [RIP + L_.str]
movsd XMM0, QWORD PTR [RIP + LCPI4_0]
movsd XMM1, QWORD PTR [RIP + LCPI4_1]
mov ESI, 100
mov AL, 2
call _printf
xor EAX, EAX
pop RBP
ret
.cfi_endproc
.section __TEXT,__cstring,cstring_literals
L_.str: ## @.str
.asciz "%d, %f, %f"
.subsections_via_symbols
command
clang++ -S -O2 -std=c++11 -mllvm --x86-asm-syntax=intel -fno-exceptions main.cpp
As you can see, two pairs of functions are identical (the constructors and the destructors):
__ZN13trivialStructC1Ev: ## @_ZN13trivialStructC1Ev
__ZN13trivialStructC2Ev: ## @_ZN13trivialStructC2Ev
__ZN13trivialStructD1Ev: ## @_ZN13trivialStructD1Ev
__ZN13trivialStructD2Ev: ## @_ZN13trivialStructD2Ev
I have no idea why the compiler generates two copies of the code instead of just one.
I am not familiar with assembly, but it looks like this just makes the code
fatter (and maybe slower).
This is part of the ABI for your platform, and is outside the scope of the standard. Both constructors and destructors can generate multiple symbols in the binary. For example, the Itanium C++ ABI will generate up to 3 constructors/destructors:
complete object constructor
base object constructor
complete object allocating constructor
deleting destructor
complete object destructor
base object destructor
The different symbols take on slightly different responsibilities as the implementation might need to do different things depending on how the object is being created/destroyed. In your particular case, the code is simple enough that all constructors might generate exactly the same code, but they need to be there to comply with the ABI, and the ABI has them to enable more complex use cases.
For example, a complete object constructor will initialize virtual bases, while the base object constructor will skip this part of construction. If there is multiple/virtual inheritance and virtual functions, the vptr in the complete object may have to jump through different sets of intermediate tables depending on how this subobject is being instantiated.
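The virtual-base case can be seen directly in source. In the sketch below (VBase, Mid and Most are made-up names), the same Mid constructor body behaves differently depending on whether Mid is the complete object or a base subobject, which is the distinction the two constructor symbols exist to express.

```cpp
// A virtual base whose tag records who initialized it.
struct VBase {
    int tag;
    explicit VBase(int t) : tag(t) {}
};

struct Mid : virtual VBase {
    // VBase(1) takes effect only when Mid is the most derived object
    // (the complete object constructor).
    Mid() : VBase(1) {}
};

struct Most : Mid {
    // The most derived class initializes the virtual base itself; Mid's
    // base object constructor then skips its own VBase(1).
    Most() : VBase(2), Mid() {}
};
```

Constructing a Mid on its own leaves tag at 1, while constructing a Most leaves it at 2, even though Mid's constructor runs in both cases.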
If you want an explanation beyond "the ABI mandates it", you should take a look at the documentation for your particular ABI. You can also take a look at Inside the C++ Object Model, which, although old, contains a good description of the problems to be solved and some of the solutions provided.