I'm stuck with a problem which is illustrated by the following g++ code:
frob.hpp:
template<typename T> T frob(T x);
template<> inline int frob<int>(int x) {
asm("1: nop\n"
".pushsection \"extra\",\"a\"\n"
".quad 1b\n"
".popsection\n");
return x+1;
}
foo.cpp:
#include "frob.hpp"
extern int bar();
int foo() { return frob(17); }
int main() { return foo() + bar(); }
bar.cpp:
#include "frob.hpp"
int bar() { return frob(42); }
I'm doing these quirky custom-section things to mimic a mechanism used in the Linux kernel (but in userland, and in C++).
My problem is that the instantiation of frob<int> is emitted as a weak symbol, which is fine, and one of the two copies is eventually discarded by the linker, which is fine too. However, the linker does not account for the fact that the extra section holds references to that symbol (via .quad 1b), and it wants to resolve them locally. I get:
localhost /tmp $ g++ -O3 foo.cpp bar.cpp
localhost /tmp $ g++ -O0 foo.cpp bar.cpp
`.text._Z4frobIiET_S0_' referenced in section `extra' of /tmp/ccr5s7Zg.o: defined in discarded section `.text._Z4frobIiET_S0_[_Z4frobIiET_S0_]' of /tmp/ccr5s7Zg.o
collect2: error: ld returned 1 exit status
(-O3 is fine because no symbol is emitted altogether).
I don't know how to work around this.
Would there be a way to tell the linker to pay attention to symbol resolution in the extra section too?
Perhaps one could trade the local labels for .weak global labels? For example:
asm(".weak exception_handler_%=\n"
"exception_handler_%=: nop\n"
".pushsection \"extra\",\"a\"\n"
".quad exception_handler_%=\n"
".popsection\n"::);
However, I fear that if I go this way, distinct asm statements in distinct compilation units may end up with the same symbol name (can they?).
Is there a workaround that I've overlooked?
g++ (5 and 6, at least) compiles an inline function with external linkage, such as
template<> inline int frob<int>(int x), as a weak global
symbol in a COMDAT function section in
its own section group. See:
g++ -S -O0 bar.cpp
bar.s
.file "bar.cpp"
.section .text._Z4frobIiET_S0_,"axG",@progbits,_Z4frobIiET_S0_,comdat
.weak _Z4frobIiET_S0_
.type _Z4frobIiET_S0_, @function
_Z4frobIiET_S0_:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl %edi, -4(%rbp)
#APP
# 8 "frob.hpp" 1
1: nop
.pushsection "extra","a"
.quad 1b
.popsection
# 0 "" 2
#NO_APP
movl -4(%rbp), %eax
addl $1, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
...
...
The relevant directives are:
.section .text._Z4frobIiET_S0_,"axG",@progbits,_Z4frobIiET_S0_,comdat
.weak _Z4frobIiET_S0_
(The compiler-generated #APP and #NO_APP delimit your inline assembly).
Do as the compiler does and make extra a COMDAT section in
a section group as well:
frob.hpp (fixed)
template<typename T> T frob(T x);
template<> inline int frob<int>(int x) {
asm("1: nop\n"
".pushsection \"extra\", \"axG\", @progbits,extra,comdat" "\n"
".quad 1b\n"
".popsection\n");
return x+1;
}
and the linkage error will be cured:
$ g++ -O0 foo.cpp bar.cpp
$ ./a.out; echo $?
61
Today, I discovered a rather interesting thing about either g++ or nm...constructor definitions appear to have two entries in libraries.
I have a header thing.hpp:
class Thing
{
Thing();
Thing(int x);
void foo();
};
And thing.cpp:
#include "thing.hpp"
Thing::Thing()
{ }
Thing::Thing(int x)
{ }
void Thing::foo()
{ }
I compile this with:
g++ thing.cpp -c -o libthing.a
Then, I run nm on it:
%> nm -gC libthing.a
0000000000000030 T Thing::foo()
0000000000000022 T Thing::Thing(int)
000000000000000a T Thing::Thing()
0000000000000014 T Thing::Thing(int)
0000000000000000 T Thing::Thing()
U __gxx_personality_v0
As you can see, both of the constructors for Thing are listed with two entries in the generated static library. My g++ is 4.4.3, but the same behavior happens in clang, so it isn't just a gcc issue.
This doesn't cause any apparent problems, but I was wondering:
Why are defined constructors listed twice?
Why doesn't this cause "multiple definition of symbol __" problems?
EDIT: For Carl, the output without the C argument:
%> nm -g libthing.a
0000000000000030 T _ZN5Thing3fooEv
0000000000000022 T _ZN5ThingC1Ei
000000000000000a T _ZN5ThingC1Ev
0000000000000014 T _ZN5ThingC2Ei
0000000000000000 T _ZN5ThingC2Ev
U __gxx_personality_v0
As you can see...the same function is generating multiple symbols, which is still quite curious.
And while we're at it, here is a section of generated assembly:
.globl _ZN5ThingC2Ev
.type _ZN5ThingC2Ev, @function
_ZN5ThingC2Ev:
.LFB1:
.cfi_startproc
.cfi_personality 0x3,__gxx_personality_v0
pushq %rbp
.cfi_def_cfa_offset 16
movq %rsp, %rbp
.cfi_offset 6, -16
.cfi_def_cfa_register 6
movq %rdi, -8(%rbp)
leave
ret
.cfi_endproc
.LFE1:
.size _ZN5ThingC2Ev, .-_ZN5ThingC2Ev
.align 2
.globl _ZN5ThingC1Ev
.type _ZN5ThingC1Ev, @function
_ZN5ThingC1Ev:
.LFB2:
.cfi_startproc
.cfi_personality 0x3,__gxx_personality_v0
pushq %rbp
.cfi_def_cfa_offset 16
movq %rsp, %rbp
.cfi_offset 6, -16
.cfi_def_cfa_register 6
movq %rdi, -8(%rbp)
leave
ret
.cfi_endproc
So the generated code is...well...the same.
EDIT: To see what constructor actually gets called, I changed Thing::foo() to this:
void Thing::foo()
{
Thing t;
}
The generated assembly is:
.globl _ZN5Thing3fooEv
.type _ZN5Thing3fooEv, @function
_ZN5Thing3fooEv:
.LFB550:
.cfi_startproc
.cfi_personality 0x3,__gxx_personality_v0
pushq %rbp
.cfi_def_cfa_offset 16
movq %rsp, %rbp
.cfi_offset 6, -16
.cfi_def_cfa_register 6
subq $48, %rsp
movq %rdi, -40(%rbp)
leaq -32(%rbp), %rax
movq %rax, %rdi
call _ZN5ThingC1Ev
leaq -32(%rbp), %rax
movq %rax, %rdi
call _ZN5ThingD1Ev
leave
ret
.cfi_endproc
So it is invoking the complete object constructor.
We'll start by declaring that GCC follows the Itanium C++ ABI.
According to the ABI, the mangled name for your Thing::foo() is easily parsed:
_Z     | N      | 5Thing  | 3foo  | E          | v
prefix | nested | `Thing` | `foo` | end nested | parameters: `void`
You can read the constructor names similarly, as below. Notice how the constructor "name" isn't given, but instead a C clause:
_Z     | N      | 5Thing  | C1          | E          | i
prefix | nested | `Thing` | Constructor | end nested | parameters: `int`
But what's this C1? Your duplicate has C2. What does this mean?
Well, this is quite simple too:
<ctor-dtor-name> ::= C1   # complete object constructor
                 ::= C2   # base object constructor
                 ::= C3   # complete object allocating constructor
                 ::= D0   # deleting destructor
                 ::= D1   # complete object destructor
                 ::= D2   # base object destructor
Wait, why is this simple? This class has no base. Why does it have a "complete object constructor" and a "base object constructor" for each?
This Q&A implies to me that this is simply a by-product of polymorphism support, even though it's not actually required in this case.
Note that c++filt used to include this information in its demangled output, but doesn't any more.
This forum post asks the same question, and the only response doesn't do any better at answering it, except for the implication that GCC could avoid emitting two constructors when polymorphism is not involved, and that this behaviour ought to be improved in the future.
This newsgroup posting describes a problem with setting breakpoints in constructors due to this dual-emission. It's stated again that the root of the issue is support for polymorphism.
In fact, this is listed as a GCC "known issue":
G++ emits two copies of constructors and destructors.
In general there are three types of constructors (and
destructors).
The complete object constructor/destructor.
The base object constructor/destructor.
The allocating constructor/deallocating destructor.
The first two are different, when virtual base classes are
involved.
The meaning of these different constructors seems to be as follows:
The "complete object constructor". It additionally constructs virtual base classes.
The "base object constructor". It creates the object itself, as well as data members and non-virtual base classes.
The "allocating object constructor". It does everything the complete object constructor does, plus it calls operator new to actually allocate the memory... but apparently this is not usually seen.
If you have no virtual base classes, [the first two] are
identical; GCC will, on sufficient optimization levels, actually alias
the symbols to the same code for both.
I've seen a few tools like Pin and DynInst that do dynamic code manipulation in order to instrument code without having to recompile. These seem like heavyweight solutions to what seems like it should be a straightforward problem: retrieving accurate function call data from a program.
I want to write something such that in my code, I can write
void SomeFunction() {
StartProfiler();
...
StopProfiler();
}
and post-execution, retrieve data about what functions were called between StartProfiler() and StopProfiler() (the whole call tree) and how long each of them took.
Preferably I could read out debug symbols too, to get function names instead of addresses.
Here's one interesting hint at a solution I discovered.
gcc (and llvm>=3.0) has a -pg option when compiling, which is traditionally for gprof support. When you compile your code with this flag, the compiler adds a call to the function mcount to the beginning of every function definition. You can override this function, but you'll need to do it in assembly, otherwise the mcount function you define will be instrumented with a call to mcount and you'll quickly run out of stack space before main even gets called.
Here's a little proof of concept:
foo.c:
#include <stdio.h>
int total_calls = 0;
void foo(int c) {
if (c > 0)
foo(c-1);
}
int main() {
foo(4);
printf("%d\n", total_calls);
}
foo.s:
.globl mcount
mcount:
movl _total_calls(%rip), %eax
addl $1, %eax
movl %eax, _total_calls(%rip)
ret
Compile with clang -pg foo.s foo.c -o foo. Result:
$ ./foo
6
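That's 1 for main and 5 for foo (foo(4) recurses down through foo(0)). printf lives in libc, which was not compiled with -pg, so it is not counted.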
Here's the asm that clang emits for foo:
_foo:
pushq %rbp
movq %rsp, %rbp
subq $16, %rsp
movl %edi, -8(%rbp) ## 4-byte Spill
callq mcount
movl -8(%rbp), %edi ## 4-byte Reload
...
I’m currently working on a reporting library as part of a large project. It contains a collection of logging and system message functions. I’m trying to use preprocessor macros to strip out the subset of function calls that are intended strictly for debugging, along with their declarations and definitions, using conditional compilation and function-like macros defined to nothing (similar to the way assert() calls are removed when NDEBUG is defined).
I’m running into a problem. I prefer to fully qualify namespaces, because I find it improves readability, and my reporting functions are wrapped in a namespace. Because the colon character can’t be part of a macro name, I can’t include the namespace in the stripping of the function calls: if I define the function names alone to nothing, I end up with a dangling Namespace::. I’ve considered just using conditional compilation to remove the bodies of those functions, but I’m worried that the compiler might not reliably optimize out the empty functions.
namespace Reporting
{
const extern std::string logFileName;
void Report(std::string msg);
void Report(std::string msg, std::string msgLogAdd);
void Log(std::string msg);
void Message(std::string msg);
#ifdef DEBUG
void Debug_Log(std::string message);
void Debug_Message(std::string message);
void Debug_Report(std::string message);
void Debug_Assert(bool test, std::string message);
#else
#define Debug_Log(x)
#define Debug_Message(x)
#define Debug_Report(x)
#define Debug_Assert(x, y)
#endif
};
Any idea on how to deal with the namespace qualifiers with the preprocessor?
Thoughts on, problems with, just removing the function code?
Any other ways to accomplish my goal?
This is how I did it when I wrote a similar library several months back. And yes, your optimizer will remove calls to empty inline functions. If you define them out of line (in a .cpp file rather than in the header), your compiler will NOT inline them unless you use LTO.
namespace Reporting
{
const extern std::string logFileName;
void Report(std::string msg);
void Report(std::string msg, std::string msgLogAdd);
void Log(std::string msg);
void Message(std::string msg);
#ifdef DEBUG
inline void Debug_Log(std::string message) { return Log(message); }
inline void Debug_Message(std::string message) { return Message(message); }
inline void Debug_Report(std::string message) { return Report(message); }
inline void Debug_Assert(bool test, std::string message) { /* Not sure what to do here */ }
#else
inline void Debug_Log(std::string) {}
inline void Debug_Message(std::string) {}
inline void Debug_Report(std::string) {}
inline void Debug_Assert(bool, std::string) {}
#endif
};
Just as a side note, don't pass strings by value unless you need to make a copy anyway. Use a const reference instead. It avoids an expensive allocation and copy of the string on EVERY function call.
EDIT: Actually, now that I think about it, just use a const char*. Looking at the assembly, it's a LOT faster, especially for empty function bodies.
GCC optimizes this out at -O1, so I don't think there's much of an issue with this:
clark@clark-laptop /tmp $ cat t.cpp
#include <cstdio>
inline void do_nothing()
{
}
int main()
{
do_nothing();
return 0;
}
clark@clark-laptop /tmp $ g++ -O1 -S t.cpp
clark@clark-laptop /tmp $ cat t.s
.file "t.cpp"
.text
.globl main
.type main, @function
main:
.LFB32:
.cfi_startproc
movl $0, %eax
ret
.cfi_endproc
.LFE32:
.size main, .-main
.ident "GCC: (Gentoo 4.5.0 p1.2, pie-0.4.5) 4.5.0"
.section .note.GNU-stack,"",@progbits
After a bit of tweaking, it seems that this will only be a FULL removal if you use const char*, NOT std::string or const std::string&. Here's the assembly for the const char*:
clark@clark-laptop /tmp $ cat t.cpp
inline void do_nothing(const char*)
{
}
int main()
{
do_nothing("test");
return 0;
}
clark@clark-laptop /tmp $ g++ -O1 -S t.cpp
clark@clark-laptop /tmp $ cat t.s
.file "t.cpp"
.text
.globl main
.type main, @function
main:
.LFB1:
.cfi_startproc
movl $0, %eax
ret
.cfi_endproc
.LFE1:
.size main, .-main
.ident "GCC: (Gentoo 4.5.0 p1.2, pie-0.4.5) 4.5.0"
.section .note.GNU-stack,"",@progbits
And here it is with const std::string&:
.file "t.cpp"
.section .rodata.str1.1,"aMS",@progbits,1
.LC0:
.string "test"
.text
.globl main
.type main, @function
main:
.LFB591:
.cfi_startproc
subq $24, %rsp
.cfi_def_cfa_offset 32
leaq 14(%rsp), %rdx
movq %rsp, %rdi
movl $.LC0, %esi
call _ZNSsC1EPKcRKSaIcE
movq (%rsp), %rdi
subq $24, %rdi
cmpq $_ZNSs4_Rep20_S_empty_rep_storageE, %rdi
je .L11
movl $_ZL22__gthrw_pthread_cancelm, %eax
testq %rax, %rax
je .L3
movl $-1, %eax
lock xaddl %eax, 16(%rdi)
jmp .L4
.L3:
movl 16(%rdi), %eax
leal -1(%rax), %edx
movl %edx, 16(%rdi)
.L4:
testl %eax, %eax
jg .L11
leaq 15(%rsp), %rsi
call _ZNSs4_Rep10_M_destroyERKSaIcE
.L11:
movl $0, %eax
addq $24, %rsp
.cfi_def_cfa_offset 8
ret
.cfi_endproc
.LFE591:
.size main, .-main
[Useless stuff removed...]
.ident "GCC: (Gentoo 4.5.0 p1.2, pie-0.4.5) 4.5.0"
.section .note.GNU-stack,"",@progbits
Huge difference, eh?
I am not sure if I fully understand your problem. Would the following help?
namespace X
{
namespace{int dummy;}
void debug_check(int);
}
#ifdef DEBUG
#define DEBUG_CHECK(ARG) debug_check(ARG)
#else
#define DEBUG_CHECK(ARG) dummy // just ignore args
#endif
int main()
{
X::DEBUG_CHECK(1);
}
This solution might not work cleanly, because it can trigger a "statement without effect" warning. A potentially better solution is to have a function declaration swallow the namespace prefix:
// debug_check and "#ifdef DEBUG" part omitted
namespace X
{
typedef void dummy_type;
}
namespace Y
{
typedef void dummy_type;
}
typedef void dummy_type;
#define DEBUG(X) dummy_type dummy_fn();
int main()
{
X::DEBUG(1);
Y::DEBUG(2);
X::DEBUG(3);
Y::DEBUG(4);
DEBUG(5);
DEBUG(6);
};
As long as every definition of dummy_type yields the same type, this should be legal, because typedefs do not create distinct types.
You could just replace your logging function with a function that does nothing, no?
I know this question was answered ages ago, but I ran into this problem when I put a log macro into a namespace. You were suggesting empty functions and optimization levels. Clark Gaebles made me think, because of the different results with const char* and const std::string&. The following code emits no instructions for the stripped call, even with no optimization enabled:
#include <iostream>
#undef _DEBUG // undefine to use __NOJOB
namespace Debug
{
typedef void __NOJOB;
class Logger
{
public:
static void Log( const char* msg, const char* file, int line )
{
std::cout << "Log: " << msg << " in " <<
file << ":" << line << std::endl;
}
};
}
#ifdef _DEBUG
#define Log( msg ) Logger::Log( msg, __FILE__, __LINE__ );
#else
#define Log( msg )__NOJOB(0);
#endif
int main()
{
Debug::Log( "please skip me" );
return 0;
}
Assembly generated by http://assembly.ynh.io/:
main:
.LFB972:
.cfi_startproc
0000 55 pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
0001 4889E5 movq %rsp, %rbp
.cfi_def_cfa_register 6 // <- stack main
// no code for void( 0 ) here
0004 B8000000 movl $0, %eax // return
00
0009 5D popq %rbp // -> end stack main
.cfi_def_cfa 7, 8
000a C3 ret
Maybe I made a mistake or misunderstood something? It would be nice to hear from you.