C++ [[noreturn]] function call and destructors

I have some C++ code in which I must be sure that a specific destructor is called before exiting and I was wondering whether or not it was called before a [[noreturn]] function.
So I wrote this simple dummy example
#include <cstdio>
#include <cstdlib>
class A {
char *i;
public:
A() : i{new char[4]} {}
~A() { delete[] i; }
void hello() { puts(i); }
};
int func()
{
A b;
exit(1);
b.hello(); // Not reached
}
I compiled with g++ /tmp/l.cc -S -O0 and I got this assembly
.file "l.cc"
.text
.section .text._ZN1AC2Ev,"axG",@progbits,_ZN1AC5Ev,comdat
.align 2
.weak _ZN1AC2Ev
.type _ZN1AC2Ev, @function
_ZN1AC2Ev:
.LFB18:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $16, %rsp
movq %rdi, -8(%rbp)
movl $4, %edi
call _Znam
movq %rax, %rdx
movq -8(%rbp), %rax
movq %rdx, (%rax)
nop
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE18:
.size _ZN1AC2Ev, .-_ZN1AC2Ev
.weak _ZN1AC1Ev
.set _ZN1AC1Ev,_ZN1AC2Ev
.text
.globl func
.type func, @function
func:
.LFB24:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $16, %rsp
leaq -8(%rbp), %rax
movq %rax, %rdi
call _ZN1AC1Ev
movl $1, %edi
call exit
.cfi_endproc
.LFE24:
.size func, .-func
.ident "GCC: (GNU) 12.2.1 20221121 (Red Hat 12.2.1-4)"
.section .note.GNU-stack,"",@progbits
There was clearly no call to the destructor.
In this stupid case it doesn't matter much, but what if I had to close a file before exiting?

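Apart from the fact that terminating a program with exit() is generally considered bad practice (it does not unwind the stack, so destructors of local objects such as b never run), you could limit the object's lifetime with an inner scope: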
int func()
{
{
A b;
/* ... */
} // Leaving scope => destructing b
exit(1);
}
PS: Assuming that you aren't writing a driver, most kernels (including Microsoft Windows NT, Unix (e.g. BSD), XNU (macOS) and Linux) automatically deallocate any allocated memory as the program exits.
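The ordering that the inner scope guarantees can be checked with a small sketch. It records events in a log instead of actually calling exit(), so the sequence is observable; Guard, events and cleanup_then_finish are hypothetical names that do not appear in the question.

```cpp
#include <string>
#include <vector>

// Hypothetical event log, used only to make the ordering visible.
std::vector<std::string> events;

// Stand-in for class A: records when its destructor runs.
struct Guard {
    ~Guard() { events.push_back("destructor"); }
};

// Mirrors the answer's func(): the object lives in an inner scope,
// so it is destroyed before the point where exit(1) would be called.
void cleanup_then_finish()
{
    {
        Guard g;
        events.push_back("work");
    } // leaving scope => g is destroyed here
    events.push_back("exit point"); // a real program would call std::exit(1) here
}
```

After one call, the log holds "work", "destructor", "exit point" in that order, which is the guarantee the extra braces buy.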

Related

Why doesn't `clang -S` generate assembly code for member functions?

Let's say I want to get clang to show me what assembly it generates for Node::Destroy in the following code:
struct Node {
Node* next = nullptr;
int x;
~Node() {
delete next;
}
void Destroy() {
delete this;
}
};
If I run clang++ -S foo.cc then it gives essentially empty output:
.text
.file "foo.cc"
.ident "Debian clang version 14.0.6-2"
.section ".note.GNU-stack","",@progbits
.addrsig
But if I change Destroy to a free-standing function that accepts Node* rather than a member function, then it actually does generate assembly:
.text
.file "foo.cc"
.globl _Z7DestroyP4Node # -- Begin function _Z7DestroyP4Node
.p2align 4, 0x90
.type _Z7DestroyP4Node,@function
_Z7DestroyP4Node: # @_Z7DestroyP4Node
.cfi_startproc
# %bb.0:
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset %rbp, -16
movq %rsp, %rbp
.cfi_def_cfa_register %rbp
subq $16, %rsp
movq %rdi, -8(%rbp)
[...]
The same is true in compiler explorer. What's the reason for this difference?

delete operator doesn't seem to call global operator delete overload

The output of the following program is
Global new
The code doesn't seem to call operator delete. Invoking ::operator delete directly does call the function, but using the regular delete operator doesn't.
I assume it's related to the compiler optimizing away the delete call. I tried to test that by placing all sorts of code after the delete call, including reassigning the pointer to a different expression. I assumed that would force the compiler to emit the delete, because otherwise there would be a leak. Still, the result was the same.
So I assume the compiler is eliminating a genuinely unnecessary call to delete through rather sophisticated analysis, but I'd like to be sure. I tried reading the assembly output from g++, but I wasn't able to understand it.
Please explain this phenomenon.
#include <iostream>
using namespace std;
void* operator new(size_t size) {
cout << "Global new\n";
return malloc(size);
}
void operator delete(void* p) {
cout << "Global delete\n";
free(p);
}
void f() {
int* x = new int;
delete x;
}
int main(int argc, char** argv) {
f();
return 0;
}
I use Mingw-w64 on Windows 7, g++ -std=c++17, version 9.3.0.
EDIT:
People have responded it does produce the expected output on their machines. So possibly a compiler bug on my machine?
Output assembly if anyone's interested:
.file "main.cpp"
.text
.lcomm _ZStL8__ioinit,1,1
.section .rdata,"dr"
.LC0:
.ascii "Global new\12\0"
.text
.globl _Znwy
.def _Znwy; .scl 2; .type 32; .endef
.seh_proc _Znwy
_Znwy:
.LFB1882:
pushq %rbp
.seh_pushreg %rbp
movq %rsp, %rbp
.seh_setframe %rbp, 0
subq $32, %rsp
.seh_stackalloc 32
.seh_endprologue
movq %rcx, 16(%rbp)
leaq .LC0(%rip), %rdx
movq .refptr._ZSt4cout(%rip), %rcx
call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
movq 16(%rbp), %rcx
call malloc
addq $32, %rsp
popq %rbp
ret
.seh_endproc
.section .rdata,"dr"
.LC1:
.ascii "Global delete\12\0"
.text
.globl _ZdlPv
.def _ZdlPv; .scl 2; .type 32; .endef
.seh_proc _ZdlPv
_ZdlPv:
.LFB1883:
pushq %rbp
.seh_pushreg %rbp
movq %rsp, %rbp
.seh_setframe %rbp, 0
subq $32, %rsp
.seh_stackalloc 32
.seh_endprologue
movq %rcx, 16(%rbp)
leaq .LC1(%rip), %rdx
movq .refptr._ZSt4cout(%rip), %rcx
call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
movq 16(%rbp), %rcx
call free
nop
addq $32, %rsp
popq %rbp
ret
.seh_endproc
.globl _Z1fv
.def _Z1fv; .scl 2; .type 32; .endef
.seh_proc _Z1fv
_Z1fv:
.LFB1884:
pushq %rbp
.seh_pushreg %rbp
movq %rsp, %rbp
.seh_setframe %rbp, 0
subq $48, %rsp
.seh_stackalloc 48
.seh_endprologue
movl $4, %ecx
call _Znwy
movq %rax, -8(%rbp)
movq -8(%rbp), %rax
testq %rax, %rax
je .L6
movl $4, %edx
movq %rax, %rcx
call _ZdlPvy
.L6:
nop
addq $48, %rsp
popq %rbp
ret
.seh_endproc
.def __main; .scl 2; .type 32; .endef
.globl main
.def main; .scl 2; .type 32; .endef
.seh_proc main
main:
.LFB1885:
pushq %rbp
.seh_pushreg %rbp
movq %rsp, %rbp
.seh_setframe %rbp, 0
subq $32, %rsp
.seh_stackalloc 32
.seh_endprologue
movl %ecx, 16(%rbp)
movq %rdx, 24(%rbp)
call __main
call _Z1fv
movl $0, %eax
addq $32, %rsp
popq %rbp
ret
.seh_endproc
.def __tcf_0; .scl 3; .type 32; .endef
.seh_proc __tcf_0
__tcf_0:
.LFB2377:
pushq %rbp
.seh_pushreg %rbp
movq %rsp, %rbp
.seh_setframe %rbp, 0
subq $32, %rsp
.seh_stackalloc 32
.seh_endprologue
leaq _ZStL8__ioinit(%rip), %rcx
call _ZNSt8ios_base4InitD1Ev
nop
addq $32, %rsp
popq %rbp
ret
.seh_endproc
.def _Z41__static_initialization_and_destruction_0ii; .scl 3; .type 32; .endef
.seh_proc _Z41__static_initialization_and_destruction_0ii
_Z41__static_initialization_and_destruction_0ii:
.LFB2376:
pushq %rbp
.seh_pushreg %rbp
movq %rsp, %rbp
.seh_setframe %rbp, 0
subq $32, %rsp
.seh_stackalloc 32
.seh_endprologue
movl %ecx, 16(%rbp)
movl %edx, 24(%rbp)
cmpl $1, 16(%rbp)
jne .L12
cmpl $65535, 24(%rbp)
jne .L12
leaq _ZStL8__ioinit(%rip), %rcx
call _ZNSt8ios_base4InitC1Ev
leaq __tcf_0(%rip), %rcx
call atexit
.L12:
nop
addq $32, %rsp
popq %rbp
ret
.seh_endproc
.def _GLOBAL__sub_I__Znwy; .scl 3; .type 32; .endef
.seh_proc _GLOBAL__sub_I__Znwy
_GLOBAL__sub_I__Znwy:
.LFB2378:
pushq %rbp
.seh_pushreg %rbp
movq %rsp, %rbp
.seh_setframe %rbp, 0
subq $32, %rsp
.seh_stackalloc 32
.seh_endprologue
movl $65535, %edx
movl $1, %ecx
call _Z41__static_initialization_and_destruction_0ii
nop
addq $32, %rsp
popq %rbp
ret
.seh_endproc
.section .ctors,"w"
.align 8
.quad _GLOBAL__sub_I__Znwy
.ident "GCC: (Rev1, Built by MSYS2 project) 9.3.0"
.def _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc; .scl 2; .type 32; .endef
.def malloc; .scl 2; .type 32; .endef
.def free; .scl 2; .type 32; .endef
.def _ZdlPvy; .scl 2; .type 32; .endef
.def _ZNSt8ios_base4InitD1Ev; .scl 2; .type 32; .endef
.def _ZNSt8ios_base4InitC1Ev; .scl 2; .type 32; .endef
.def atexit; .scl 2; .type 32; .endef
.section .rdata$.refptr._ZSt4cout, "dr"
.globl .refptr._ZSt4cout
.linkonce discard
.refptr._ZSt4cout:
.quad _ZSt4cout
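One detail in the listing is worth noting: f() calls _ZdlPvy, the mangled name of the sized form operator delete(void*, unsigned long long), rather than _ZdlPv, which is the only form the program replaces. A possible explanation, offered here as an assumption and not as a confirmed diagnosis, is that the sized call is resolved inside the runtime library and never reaches the user's replacement. A minimal sketch that replaces both forms follows; the counters and the volatile pointer are hypothetical additions that make the calls observable without printing.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical counters, used instead of printing so the operators stay simple.
int new_calls = 0;
int delete_calls = 0;

void* operator new(std::size_t size)
{
    ++new_calls;
    if (void* p = std::malloc(size ? size : 1)) return p;
    throw std::bad_alloc{};
}

// Unsized form: the only one replaced in the question.
void operator delete(void* p) noexcept
{
    ++delete_calls;
    std::free(p);
}

// Sized form (C++14): a delete expression may call this one instead.
void operator delete(void* p, std::size_t) noexcept
{
    ++delete_calls;
    std::free(p);
}

void f()
{
    // volatile keeps the compiler from eliding the new/delete pair
    int* volatile x = new int;
    delete x;
}
```

With both forms replaced, it no longer matters which one the compiler picks for the delete expression: either way the call lands in user code.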

Deciphering the text and data segments from gcc assembly output

I am trying to examine the use of data and text segments in memory via a simple program, named source1.cpp:
int main()
{
const char* b="Hello everyone!";
int a=100;
return 0;
}
I generated the assembly by issuing gcc -S source1.cpp, and here is the output:
.file "source1.cpp"
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $48, %rsp
movq %fs:40, %rax
movq %rax, -8(%rbp)
xorl %eax, %eax
movabsq $8531260732055774536, %rax
movq %rax, -32(%rbp)
movabsq $9400199222489701, %rax
movq %rax, -24(%rbp)
movl $100, -36(%rbp)
movl $0, %eax
movq -8(%rbp), %rdx
xorq %fs:40, %rdx
je .L3
call __stack_chk_fail
.L3:
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Ubuntu 5.4.0-6ubuntu1~16.04.2) 5.4.0 20160609"
.section .note.GNU-stack,"",@progbits
Could anyone tell me how to figure out the text and data segments, or documentation that might help me in this?

new [], delete [] complexity

I already know that the new[] operator first allocates memory and then calls the constructor for each element and that the delete[] operator first calls the destructor for each element and then frees memory and, because of that, they both have an O(n) time complexity.
But if I have a class, for which I have not defined any constructor/destructor, will the complexity still be O(n), or will it be just O(1)?
For instance, if I have two classes:
class foo
{
public:
int a;
foo()
{
a = 0;
// more stuff
}
~foo()
{
a = 1;
// some useful stuff here
}
};
class boo
{
public:
int a;
};
And I create two arrays of them like this:
int n = 1000;
foo* pfoo = new foo[n];
boo* pboo = new boo[n];
I'm pretty sure the first new call will have an O(n) complexity, but what about the second? Will new just allocate the necessary memory and that's it, or will it call some default constructor (I'm not sure if such a thing actually exists in C++) for each element?
And the same question for delete:
delete [] pfoo;
delete [] pboo;
When I delete the second array will the complexity still be O(n), or will delete just deallocate the memory in O(1) complexity?
When you don't know, it's a great idea to look at the assembly output. For example, let's assume this is the code to compare.
class foo
{
public:
int a;
foo()
{
a = 0;
// more stuff
}
~foo()
{
a = 1;
// some useful stuff here
}
};
class boo
{
public:
int a;
};
void remove_foo(foo* pfoo) {
delete [] pfoo;
}
void remove_boo(boo *pboo) {
delete [] pboo;
}
When compiling with optimizations using gcc (Clang gives similar output), you get the following result.
.file "deleter.cpp"
.text
.p2align 4,,15
.globl _Z10remove_fooP3foo
.type _Z10remove_fooP3foo, @function
_Z10remove_fooP3foo:
.LFB6:
.cfi_startproc
testq %rdi, %rdi
je .L1
movq -8(%rdi), %rax
leaq (%rdi,%rax,4), %rax
cmpq %rax, %rdi
je .L4
.p2align 4,,10
.p2align 3
.L6:
subq $4, %rax
movl $1, (%rax)
cmpq %rax, %rdi
jne .L6
.L4:
subq $8, %rdi
jmp _ZdaPv
.p2align 4,,10
.p2align 3
.L1:
rep ret
.cfi_endproc
.LFE6:
.size _Z10remove_fooP3foo, .-_Z10remove_fooP3foo
.p2align 4,,15
.globl _Z10remove_booP3boo
.type _Z10remove_booP3boo, @function
_Z10remove_booP3boo:
.LFB7:
.cfi_startproc
testq %rdi, %rdi
je .L8
jmp _ZdaPv
.p2align 4,,10
.p2align 3
.L8:
rep ret
.cfi_endproc
.LFE7:
.size _Z10remove_booP3boo, .-_Z10remove_booP3boo
.ident "GCC: (SUSE Linux) 4.8.1 20130909 [gcc-4_8-branch revision 202388]"
.section .note.GNU-stack,"",@progbits
It's easy to tell that for foo the destructor runs for every element (its body is inlined into the loop at .L6), whereas for boo the function jumps straight to operator delete[] (_ZdaPv after name mangling). The same happens without optimizations. The code is longer because the methods are actually emitted, but it is still clear that delete[] is called directly for boo.
.file "deleter.cpp"
.section .text._ZN3fooD2Ev,"axG",@progbits,_ZN3fooD5Ev,comdat
.align 2
.weak _ZN3fooD2Ev
.type _ZN3fooD2Ev, @function
_ZN3fooD2Ev:
.LFB4:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movq %rdi, -8(%rbp)
movq -8(%rbp), %rax
movl $1, (%rax)
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE4:
.size _ZN3fooD2Ev, .-_ZN3fooD2Ev
.weak _ZN3fooD1Ev
.set _ZN3fooD1Ev,_ZN3fooD2Ev
.text
.globl _Z10remove_fooP3foo
.type _Z10remove_fooP3foo, @function
_Z10remove_fooP3foo:
.LFB6:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
pushq %rbx
subq $24, %rsp
.cfi_offset 3, -24
movq %rdi, -24(%rbp)
cmpq $0, -24(%rbp)
je .L3
movq -24(%rbp), %rax
subq $8, %rax
movq (%rax), %rax
leaq 0(,%rax,4), %rdx
movq -24(%rbp), %rax
leaq (%rdx,%rax), %rbx
.L6:
cmpq -24(%rbp), %rbx
je .L5
subq $4, %rbx
movq %rbx, %rdi
call _ZN3fooD1Ev
jmp .L6
.L5:
movq -24(%rbp), %rax
subq $8, %rax
movq %rax, %rdi
call _ZdaPv
.L3:
addq $24, %rsp
popq %rbx
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE6:
.size _Z10remove_fooP3foo, .-_Z10remove_fooP3foo
.globl _Z10remove_booP3boo
.type _Z10remove_booP3boo, @function
_Z10remove_booP3boo:
.LFB7:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $16, %rsp
movq %rdi, -8(%rbp)
cmpq $0, -8(%rbp)
je .L7
movq -8(%rbp), %rax
movq %rax, %rdi
call _ZdaPv
.L7:
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE7:
.size _Z10remove_booP3boo, .-_Z10remove_booP3boo
.ident "GCC: (SUSE Linux) 4.8.1 20130909 [gcc-4_8-branch revision 202388]"
.section .note.GNU-stack,"",@progbits
This also applies to new[]: for boo, _Znam is called directly, without constructing any objects, even without optimizations.
In general, a user-provided constructor or destructor means that new[] and delete[] cannot run in constant time, because each element must be constructed or destroyed. Without them (and with members that are themselves trivially constructible and destructible), the compiler emits no per-element calls, the type is a POD, and creating the array is a simple malloc-like call. There are some exceptions (involving various optimizations), but usually code with constructors/destructors will be O(N) and code without will be O(1), assuming an O(1) implementation of operator new[]/operator delete[].
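The same difference can be observed without reading assembly. The following sketch counts constructor and destructor calls for a class shaped like foo and uses type traits to confirm that a class shaped like boo needs no per-element work. Counted, Plain, the counters and make_and_drop are hypothetical names introduced for illustration.

```cpp
#include <cstddef>
#include <type_traits>

// Like foo: user-provided constructor and destructor, counted here.
struct Counted {
    static int constructed;
    static int destroyed;
    int a;
    Counted() : a(0) { ++constructed; }
    ~Counted() { ++destroyed; }
};
int Counted::constructed = 0;
int Counted::destroyed = 0;

// Like boo: no user-provided constructor or destructor.
struct Plain {
    int a;
};

// No per-element code has to run for Plain, so new[]/delete[] reduce to
// the allocation and deallocation calls.
static_assert(std::is_trivially_default_constructible<Plain>::value, "");
static_assert(std::is_trivially_destructible<Plain>::value, "");
static_assert(!std::is_trivially_destructible<Counted>::value, "");

// Allocates and frees n elements of each type.
void make_and_drop(std::size_t n)
{
    Counted* c = new Counted[n]; // runs n constructors
    Plain* p = new Plain[n];     // allocation only
    delete[] c;                  // runs n destructors, then frees
    delete[] p;                  // frees only
}
```

Calling make_and_drop(1000) once leaves both counters at 1000, matching the O(n) behaviour described for foo.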
It depends on your exact syntax:
auto x = new unsigned[2];
auto y = new unsigned[2]();
::std::cout << x[0] << "\n" << x[1] << "\n" << y[0] << "\n" << y[1] << "\n";
delete[] x;
delete[] y;
gives the output (on my machine):
3452816845
3452816845
0
0
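This is because x is default-initialized, which leaves the unsigned values indeterminate, while y is value-initialized, which sets them to zero. Zeroing n elements is itself O(n) work.
delete[], on the other hand, is even simpler to understand: if your data type has a destructor, it is called for each element. The built-in (and thus POD) types have none.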
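The two initialization forms can be wrapped in small helpers to make the distinction explicit. This is a sketch with hypothetical function names; it deliberately never reads the default-initialized array before writing to it, since those values are indeterminate.

```cpp
#include <cstddef>

// Value-initialization: the trailing () zeroes every element, which is O(n).
unsigned* make_zeroed(std::size_t n)
{
    return new unsigned[n]();
}

// Default-initialization: the elements are left indeterminate, so they
// must be written before they are read.
unsigned* make_uninitialized(std::size_t n)
{
    return new unsigned[n];
}

// Returns true when the first n elements are all zero.
bool all_zero(const unsigned* p, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        if (p[i] != 0) return false;
    return true;
}
```

all_zero(make_zeroed(n), n) holds for any n, whereas nothing can be assumed about the contents returned by make_uninitialized.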
MyClass *p = static_cast<MyClass*> (::operator new (sizeof(MyClass[N])));
This allocates memory for N objects without constructing them, so the complexity is the same as that of malloc(). It is obviously faster than allocating and constructing objects of a complex class.
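Raw storage obtained this way still has to be populated and torn down by hand. A minimal sketch of that bookkeeping, assuming a hypothetical Item type and helper names that are not part of the answer above:

```cpp
#include <cstddef>
#include <new>
#include <string>

// Hypothetical element type with a non-trivial constructor and destructor.
struct Item {
    std::string name;
    explicit Item(const std::string& n) : name(n) {}
};

// Allocates raw storage for n Items without constructing any of them.
Item* allocate_raw(std::size_t n)
{
    return static_cast<Item*>(::operator new(n * sizeof(Item)));
}

// Constructs one Item in already-allocated storage (placement new).
Item* construct_at_slot(Item* storage, std::size_t i, const std::string& name)
{
    return new (storage + i) Item(name);
}

// Destroys the first `constructed` Items, then releases the raw storage.
void destroy_and_free(Item* storage, std::size_t constructed)
{
    for (std::size_t i = constructed; i-- > 0; )
        storage[i].~Item();
    ::operator delete(storage);
}
```

The allocation itself costs the same as malloc(), but the per-element cost has only been deferred: every slot that gets constructed must later be destroyed explicitly before the storage is released.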
But if I have a class, for which I have not defined any constructor/destructor, will the complexity still be O(n), or will it be just O(1)?
The members themselves might still have destructors. In short, for PODs, delete[] will be O(1).
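Whether a given class falls into the O(1) case can be checked at compile time. The sketch below uses std::is_trivially_destructible; PlainPoint, Named and needs_element_destructors are hypothetical names chosen for illustration.

```cpp
#include <string>
#include <type_traits>

// No user-written destructor, and every member is trivially destructible.
struct PlainPoint {
    int x;
    int y;
};

// No user-written destructor either, but the std::string member has one,
// so the implicit destructor must run for every element.
struct Named {
    int id;
    std::string name;
};

// Reports whether delete[] needs to run per-element destructor code for T.
template <typename T>
constexpr bool needs_element_destructors()
{
    return !std::is_trivially_destructible<T>::value;
}

static_assert(!needs_element_destructors<PlainPoint>(), "delete[] can just free");
static_assert(needs_element_destructors<Named>(), "delete[] destroys each element");
```

Named shows the point made above: the class defines no destructor of its own, yet delete[] on an array of Named is still O(n) because of its member.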

g++ incorrect loop?

I have a real world program that is similar to this one, which I'll call test.cpp:
#include <stdlib.h>
extern void f(size_t i);
int sample(size_t x)
{
size_t a = x;
size_t i;
for (i = a-2; i>=0; i--) {
f(i);
}
}
And my problem is that it is an infinite loop.
If I run the following command:
g++ -S -o test.s test.cpp
I get the following assembly sequence:
.file "test.cpp"
.text
.globl _Z6samplem
.type _Z6samplem, @function
_Z6samplem:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $32, %rsp
movq %rdi, -24(%rbp)
movq -24(%rbp), %rax
movq %rax, -8(%rbp)
movq -8(%rbp), %rax
subq $2, %rax
movq %rax, -16(%rbp)
.L2:
movq -16(%rbp), %rax
movq %rax, %rdi
call _Z1fm
subq $1, -16(%rbp)
jmp .L2
.cfi_endproc
.LFE0:
.size _Z6samplem, .-_Z6samplem
.ident "GCC: (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3"
.section .note.GNU-stack,"",@progbits
I'm no expert in assembly language, but I would expect to see code for the comparison i >= 0 and a conditional jump out of the loop. What's going on here??
GNU C++ 4.6.3 on Ubuntu Linux
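size_t is unsigned, so the condition i>=0 is always true: i can never be negative, and decrementing it past 0 wraps around to the largest size_t value. The compiler therefore drops the comparison and emits the unconditional jmp .L2 seen in the listing.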
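One common way to count down with an unsigned index is to test before decrementing. The sketch below is one possible rewrite, not the asker's actual fix; countdown_indices is a hypothetical name, and it collects the indices in a vector where the original code would call f(i).

```cpp
#include <cstddef>
#include <vector>

// Visits x-2, x-3, ..., 0, which is what the original loop intended.
// Returns the visited indices so the order can be inspected; the
// original code would call f(i) instead of push_back.
std::vector<std::size_t> countdown_indices(std::size_t x)
{
    std::vector<std::size_t> visited;
    if (x < 2) return visited; // x-2 would wrap around for x == 0 or 1
    // Test before decrementing, so the comparison never relies on i < 0.
    for (std::size_t i = x - 1; i-- > 0; )
        visited.push_back(i);
    return visited;
}
```

For x = 5 this visits 3, 2, 1, 0 and then stops, because the condition compares the value of i before the decrement against 0 instead of asking whether an unsigned value is non-negative.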