Does the (gcc) compiler optimize away empty-body functions?

Using policy based design, an EncapsulatedAlgorithm:
template< typename Policy>
class EncapsulatedAlgorithm : public Policy
{
double x = 0;
public:
using Policy::subCalculate;
void calculate()
{
Policy::subCalculate(x);
}
protected:
~EncapsulatedAlgorithm() = default;
};
may have a policy Policy that performs a sub-calculation. The sub-calculation is not necessary for the algorithm: it can be used in some cases to speed up algorithm convergence. So, to model that, let's say there are three policies.
One that just "logs" something:
struct loggedCalculation
{
static void subCalculate(double& x)
{
std::cout << "Doing the calculation" << std::endl;
}
};
one that calculates:
struct calculate
{
static void subCalculate(double& x)
{
x = x * x;
}
};
and one to bring them all and in the darkness bind them :D - that does absolutely nothing:
struct doNothing
{
static void subCalculate(double& x)
{
// Do nothing.
}
};
Here is the example program:
typedef EncapsulatedAlgorithm<doNothing> nothingDone;
typedef EncapsulatedAlgorithm<calculate> calculationDone;
typedef EncapsulatedAlgorithm<loggedCalculation> calculationLogged;
int main(int argc, const char *argv[])
{
nothingDone n;
n.calculate();
calculationDone c;
c.calculate();
calculationLogged l;
l.calculate();
return 0;
}
And here is the live example. I tried examining the assembly code produced by gcc with the optimization turned on:
g++ -S -O3 -std=c++11 main.cpp
but I do not know enough about Assembly to interpret the result with certainty - the resulting file was tiny and I was unable to recognize the function calls, because the code of the static functions of all policies was inlined.
What I could see is that when no optimization is set, the main function contains a call and a subsequent leave related to 'doNothing::subCalculate':
call _ZN9doNothing12subCalculateERd
leave
Here are my questions:
Where do I start to learn in order to be able to read what g++ -S spews out?
Is the empty function optimized away or not and where in main.s are those lines?
Is this design O.K.? Usually, implementing a function that does nothing is a bad thing, as the interface is saying something completely different (subCalculate instead of doNothing), but in the case of policies, the policy name clearly states that the function will not do anything. Otherwise I need to do type traits stuff like enable_if, etc, just to exclude a single function call.

I went to http://assembly.ynh.io/, which shows assembly output. I entered this code:
template< typename Policy>
struct EncapsulatedAlgorithm : public Policy
{
void calculate(double& x)
{
Policy::subCalculate(x);
}
};
struct doNothing
{
static void subCalculate(double& x)
{
}
};
void func(double& x) {
EncapsulatedAlgorithm<doNothing> a;
a.calculate(x);
}
and got these results:
.Ltext0:
.globl _Z4funcRd
_Z4funcRd:
.LFB2:
.cfi_startproc #void func(double& x) {
.LVL0:
0000 F3 rep #not sure what this is
0001 C3 ret #}
.cfi_endproc
.LFE2:
.Letext0:
Well, I only see two opcodes in the assembly there. rep (no idea what that is) and end function. It appears that the G++ compiler can easily optimize out the function bodies.

Where do I start to learn in order to be able to read what g++ -S spews out?
This site's not for recommending reading material. Google "x86 assembly language".
Is the empty function optimized away or not and where in main.s are those lines?
It will have been when the optimiser was enabled, so there won't be any lines in the generated .S. You've already found the call in the unoptimised output....
In fact, even the policy that's meant to do a multiplication may be removed, as the compiler should be able to work out that you're not using the resultant value. Add code to print the value of x, and seed x from some value that can't be known at compile time (it's often convenient to use argc in a little experimental program like this); then you'll be forcing the compiler to at least leave in the functionally significant code.
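A minimal sketch of that experiment follows. It assumes a simplified version of the question's class with a seeding constructor and a `value()` accessor; both, along with the `runExperiment` helper, are hypothetical additions that are not in the original code.

```cpp
#include <iostream>

// Policy from the question: squares the value in place.
struct calculate
{
    static void subCalculate(double& x) { x = x * x; }
};

// Policy from the question: intentionally empty.
struct doNothing
{
    static void subCalculate(double&) {}
};

// Simplified host class. The seeding constructor and value() are
// hypothetical additions so that the result can be observed.
template <typename Policy>
class EncapsulatedAlgorithm : public Policy
{
    double x;
public:
    explicit EncapsulatedAlgorithm(double seed) : x(seed) {}
    void calculate() { Policy::subCalculate(x); }
    double value() const { return x; }
};

// Seeds x from a value the compiler cannot know in advance and prints
// the result, so the multiplication cannot be discarded as dead code.
template <typename Policy>
double runExperiment(int seed)
{
    EncapsulatedAlgorithm<Policy> algorithm(seed);
    algorithm.calculate();
    std::cout << algorithm.value() << '\n';
    return algorithm.value();
}
```

Calling `runExperiment<calculate>(argc)` and `runExperiment<doNothing>(argc)` from `main` and compiling with `g++ -S -O3` should leave the multiplication visible in the output, while the empty policy is still expected to contribute no instructions.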
Is this design O.K.?
That depends on a lot of things (like whether you want to use templates given the implementation needs to be exposed in the header file, whether you want to deal with having distinct types for every instantiation...), but you're implementing the design correctly.
Usually, implementing a function that does nothing is a bad thing, as the interface is saying something completely different (subCalculate instead of doNothing), but in the case of policies, the policy name clearly states that the function will not do anything. Otherwise I need to do type traits stuff like enable_if, etc, just to exclude a single function call.
You may want to carefully consider your function names... do_any_necessary_calculations(), ensure_exclusivity() instead of lock_mutex(), after_each_value() instead of print_breaks etc..

Related

Given a call stack containing a lambda function, how can one determine its source?

Suppose you have some code that pushes onto a queue like this:
template <typename T>
void submitJobToPool(T callable)
{
someJobQueue.push(callable);
}
...and later on:
template <typename T>
void runJobFromPool(T callable)
{
auto job = someJobQueue.pop();
job();
}
Now imagine that the code crashes due to some error inside of the job() call. If the submitted job was a normal function, the call stack might look something like this:
void myFunction() 0x345678901
void runJobFromPool() 0x234567890
int main(int, char**) 0x123456789
It's easy to see what function crashed here. If it's a functor, it'll be similar but with an operator() in there somewhere (ignoring inlining). However, for a lambda...
void lambda_a7009ccf8810b62b59083b4c1779e569() 0x345678901
void runJobFromPool() 0x234567890
int main(int, char**) 0x123456789
This is not so easy to debug. If there's a debugger attached when it happens, or a core dump available, then that information can be used to derive which lambda crashed, but that information is not always available. As far as I know, disassembly is one of the few ways to determine what crashed from this.
The ideas I've had to make this better are:
Using a tool like addr2line if the platform supports it. This sometimes works, sometimes not.
Wrapping up all lambdas in functors (not ideal, to say the least).
Not using lambdas (again, not ideal).
Using a compiler extension to give the lambda a more meaningful name / add debugging info.
The 4th option sounded promising, so I did some investigation, but couldn't find anything. In case it matters, the compilers I have available are clang++ 5.0 and MSVC 19 (Visual Studio 2015).
My question is, what other tools / techniques are available that can help map a callstack with a lambda function in it to the corresponding source line?

I am afraid it is not possible. You should design your own technique for storing the required information in lambdas. Your option 2 is suitable here. You can look at how Google does it: https://cs.chromium.org/chromium/src/base/task_scheduler/post_task.h
Below is a very raw approach (https://ideone.com/OFCgAq)
#include <functional>
#include <iostream>
#include <stack>
#include <string>
std::stack<std::function<void(void)>> someJobQueue;
template <typename T>
void submitJobToPool(std::string from_here, T callable) {
someJobQueue.push(std::bind([callable](std::string from_here) { callable(); }, from_here));
}
void runJobFromPool() {
auto job = someJobQueue.top();
someJobQueue.pop();
job();
}
int main() {
submitJobToPool(__func__, [](){ std::cout << "It's me." << std::endl; });
runJobFromPool();
return 0;
}
Unfortunately you will not see a perfect call stack. But you can see from_here in a debugger.
void lambda_1a7009ccf8810b62b59083b4c1779e56() 0x345678920
void lambda_a7009ccf8810b62b59083b4c1779e569() 0x345678910 <-- Here `from_here` will be available: "main"
void runJobFromPool() 0x234567890
int main(int, char**) 0x123456780
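A variation on the same idea is to record the submitting file and line instead of the function name. The sketch below is only illustrative: the `Job` struct, the `SUBMIT_JOB` macro, `submitJobFrom`, and `lastJobOrigin` are all hypothetical names, and a real job system would log the origin somewhere a crash handler can reach.

```cpp
#include <functional>
#include <queue>
#include <string>
#include <utility>

// A queued job together with the place it was submitted from.
struct Job
{
    std::function<void()> run;
    const char* file;
    int line;
};

static std::queue<Job> jobQueue;

// Origin of the job currently (or most recently) being run. A crash
// handler or log statement could print this when job.run() fails.
static std::string lastJobOrigin;

template <typename T>
void submitJobFrom(const char* file, int line, T callable)
{
    jobQueue.push(Job{std::move(callable), file, line});
}

// Variadic so that lambdas containing commas can be passed through.
#define SUBMIT_JOB(...) submitJobFrom(__FILE__, __LINE__, __VA_ARGS__)

void runJobFromPool()
{
    Job job = std::move(jobQueue.front());
    jobQueue.pop();
    // Record the origin before running, so it is available if the job crashes.
    lastJobOrigin = std::string(job.file) + ":" + std::to_string(job.line);
    job.run();
}
```

With this in place, `SUBMIT_JOB([] { /* work */ });` tags each lambda with its source location without wrapping it in a hand-written functor.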

One technique is to create a struct (ideally via a script) that has multiple functions (one for each lambda) and call the lambda via this struct strictly. These functions could just take this lambda and execute it.
This struct method will show up in the crash logs and the stack trace.
struct LambdaCrashLogs {
static void Lambda1(std::function<void(void)> job) {
job();
}
};
The main approach will be to write a pre-commit script that generates this struct automatically.

Bad practice to call static function from external file via function pointer?

Consider the following code:
file_1.hpp:
typedef void (*func_ptr)(void);
func_ptr file1_get_function(void);
file_1.cpp:
// file_1.cpp
#include "file_1.hpp"
static void some_func(void)
{
do_stuff();
}
func_ptr file1_get_function(void)
{
return some_func;
}
file2.cpp
#include "file_1.hpp"
void file2_func(void)
{
func_ptr function_pointer_to_file1 = file1_get_function();
function_pointer_to_file1();
}
While I believe the above example is technically possible - to call a function with internal linkage only via a function pointer, is it bad practice to do so? Could there be some funky compiler optimizations that take place (auto inline, for instance) that would make this situation problematic?

There's no problem, this is fine. In fact, IMHO, it is good practice, since it lets your function be called without polluting the space of externally visible symbols.
It would also be appropriate to use this technique in the context of a function lookup table, e.g. a calculator which passes in a string representing an operator name, and expects back a function pointer to the function for doing that operation.
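A minimal sketch of such a lookup table, assuming a hypothetical `lookup_operator` function and three made-up operations (none of these names come from the question):

```cpp
#include <map>
#include <string>

typedef double (*binary_op)(double, double);

// Internal linkage: these names are invisible outside this translation unit.
static double add(double a, double b) { return a + b; }
static double subtract(double a, double b) { return a - b; }
static double multiply(double a, double b) { return a * b; }

// The only externally visible symbol. Returns a null pointer when the
// operator name is unknown.
binary_op lookup_operator(const std::string& name)
{
    static const std::map<std::string, binary_op> table = {
        {"add", add},
        {"subtract", subtract},
        {"multiply", multiply},
    };
    const auto it = table.find(name);
    return it == table.end() ? nullptr : it->second;
}
```

A caller in another file only needs the declaration of `lookup_operator` and the `binary_op` typedef; it should check the returned pointer for null before calling through it.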
The compiler/linker isn't allowed to make optimizations which break correct code and this is correct code.
Historical note: back in C89, externally visible symbols had to be unique on the first 6 characters; this was relaxed in C99 and also commonly by compiler extension.

In order for this to work, you have to expose some portion of it as external and that's the clue most compilers will need.
Is there a chance that there's a broken compiler out there that will make mincemeat of this strange practice because they didn't foresee someone doing it? I can't answer that.
I can only think of false reasons to want to do this though: Finger print hiding, which fails because you have to expose it in the function pointer decl, unless you are planning to cast your way around things, in which case the question is "how badly is this going to hurt".
The other reason would be facading callbacks - you have some super-sensitive static local function in module m and you now want to expose the functionality in another module for callback purposes, but you want to audit that so you want a facade:
static void voodoo_function() {
}
fnptr get_voodoo_function(const char* file, int line) {
// you tagged the question as C++, so C++ io it is.
std::cout << "requested voodoo function from " << file << ":" << line << "\n";
return voodoo_function;
}
...
// question tagged as c++, so I'm using c++ syntax
auto* fn = get_voodoo_function(__FILE__, __LINE__);
but that's not really helping much, you really want a wrapper around execution of the function.
At the end of the day, there is a much simpler way to expose a function pointer. Provide an accessor function.
static void voodoo_function() {}
void do_voodoo_function() {
// provide external access to voodoo
voodoo_function();
}
Because here you provide the compiler with an optimization opportunity - when you link, if you specify whole program optimization, it can detect that this is a facade that it can eliminate, because you let it worry about function pointers.
But is there a really compelling reason not just to remove the static from in front of voodoo_function, other than not exposing the internal name for it? And if so, why is the internal name so precious that you would go to these lengths to hide it?
static void ban_account_if_user_is_ugly() {
...;
}
void do_that_thing() {
ban_account_if_user_is_ugly();
}
vs
void do_that_thing() { // ban account if user is ugly
...
}
--- EDIT ---
Conversion. Your function pointer is int(*)(int) but your static function is unsigned int(*)(unsigned int) and you don't want to have to cast it.
Again: Just providing a facade function would solve the problem, and it will transform into a function pointer later. Converting it to a function pointer by hand can only be a stumbling block for the compiler's whole program optimization.
But if you're casting, let's consider this:
// v1
fnptr get_fn_ptr() {
// brute force cast because otherwise it's 'hassle'
return (fnptr)(static_fn);
}
// v2
int facade_fn(int i) {
auto ui = static_cast<unsigned int>(i);
auto result = static_fn(ui);
return static_cast<int>(result);
}
Ok unsigned to signed, not a big deal. And then someone comes along and changes what fnptr needs to be to void(int, float);. One of the above becomes a weird runtime crash and one becomes a compile error.

Stop goto optimization g

When writing asm code, there is a trick to slow down the code by a cycle or two by telling the cpu to explicitly jump to the next instruction. I was thinking to do something similar using C++ templates. Here's my code:
template <unsigned int c>
inline void adelay()
{
goto x;
x:
adelay<c-1>();
}
template <>
inline void adelay<0>()
{
}
Although the idea seems sound, the optimizer appears to be getting in the way by removing the jmp code. Any ideas how this could be implemented?
Background
The reason for wanting to do this is to slow down the code of a micro-controller such that it outputs a light beam pulse at a very specific frequency. This is a very specialized use, and is not common except in low-level hardware access such as writing drivers or programming micro-controllers. Even then I try to avoid such things when at all possible. Unfortunately, this cannot always be avoided.

That's what the optimizer should do: optimize, including removal of non-functional code.
Either disable the optimizations completely in your compiler options or use other methods to slow your program, there are plenty of APIs that allow you to sleep for a defined time.

You can add this attribute:
template <>
inline void __attribute__((optimize("O0"))) adelay<0>()
{
}
Which should prevent the optimization. As others have mentioned, there are probably better ways, but if this is purely for learning purposes then all good. I usually use this to verify assembler output really quickly or when I am not at a command line.

Thanks for the help all. Instead of using jmp instructions, I went with nop instructions:
template <unsigned int c>
inline void adelay()
{
asm("nop");
adelay<c-1>();
}
template <>
inline void adelay<0>()
{
}
At one point I used a reference to a volatile variable, which worked at a slightly coarser granularity:
static volatile int _adelay = 0;
template <unsigned int c>
inline void adelay()
{
_adelay;
adelay<c-1>();
}
template <>
inline void adelay<0>()
{
}
That may be useful when I'm running low on memory.
Thanks again! :)
Adrian

How to check if a function exists in C/C++?

In certain situations in my code, I need to invoke a function only if that function is defined; otherwise I should not. How can I achieve this?
like:
if (function 'sum' exists ) then invoke sum ()
Maybe the other way around to ask this question is how to determine if function is defined at runtime and if so, then invoke?

When you declare 'sum' you could declare it like:
#define SUM_EXISTS
int sum(std::vector<int>& addMeUp) {
...
}
Then when you come to use it you could go:
#ifdef SUM_EXISTS
int result = sum(x);
...
#endif
I'm guessing you're coming from a scripting language where things are all done at runtime. The main thing to remember with C++ is the two phases:
Compile time
Preprocessor runs
template code is turned into real source code
source code is turned into machine code
runtime
the machine code is run
So all the #define and things like that happen at compile time.
....
If you really wanted to do it all at runtime .. you might be interested in using some of the component architecture products out there.
Or maybe a plugin kind of architecture is what you're after.

Using GCC you can:
void func(int argc, char *argv[]) __attribute__((weak)); // weak declaration must always be present
// optional definition:
/*void func(int argc, char *argv[]) {
printf("FOUND THE FUNCTION\n");
for(int aa = 0; aa < argc; aa++){
printf("arg %d = %s \n", aa, argv[aa]);
}
}*/
int main(int argc, char *argv[]) {
if (func){
func(argc, argv);
} else {
printf("did not find the function\n");
}
}
If you uncomment func it will run it otherwise it will print "did not find the function\n".

While other replies give helpful advice (dlsym, function pointers, ...), you cannot compile C++ code referring to a function which does not exist. At minimum, the function has to be declared; if it is not, your code won't compile. If nothing (a compilation unit, some object file, some library) defines the function, the linker would complain (unless it is weak, see below).
But you should really explain why you are asking that. I can't guess, and there is some way to achieve your unstated goal.
Notice that dlsym often requires functions without name mangling, i.e. declared as extern "C".
If coding on Linux with GCC, you might also use the weak function attribute in declarations. The linker would then set undefined weak symbols to null.
addenda
If you are getting the function name from some input, you should be aware that only a subset of functions should be callable that way (if you call an arbitrary function without care, it will crash!) and you'd better explicitly construct that subset. You could then use a std::map, or dlsym (with each function in the subset declared extern "C"). Notice that dlopen with a NULL path gives a handle to the main program, which you should link with -rdynamic to have it work correctly.
You really want to call by their name only a suitably defined subset of functions. For instance, you probably don't want to call this way abort, exit, or fork.
NB. If you know dynamically the signature of the called function, you might want to use libffi to call it.

I suspect that the poster was actually looking for something more along the lines of SFINAE checking/dispatch. With C++ templates, you can define two template functions: one which calls the desired function (if it exists) and one that does nothing (if the function does not exist). You can then make the first template depend on the desired function, such that the template is ill-formed when the function does not exist. This is valid because in C++ template substitution failure is not an error (SFINAE), so the compiler will just fall back to the second case.
See here for an excellent example: Is it possible to write a template to check for a function's existence?
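As a rough sketch of that dispatch (not taken from the linked question), the following assumes a free function named `sum` that is found through argument-dependent lookup. The helper names `call_sum_if_exists` and `call_sum_if_exists_impl`, and the types `WithSum` and `WithoutSum`, are hypothetical.

```cpp
// Preferred overload: only viable when the expression sum(a, b) is
// well-formed for T. The int parameter makes it the better match for 0.
template <typename T>
auto call_sum_if_exists_impl(const T& a, const T& b, int)
    -> decltype(void(sum(a, b)), bool())
{
    sum(a, b);
    return true;
}

// Fallback: chosen when substitution into the overload above fails.
template <typename T>
bool call_sum_if_exists_impl(const T&, const T&, long)
{
    return false;
}

// Calls sum(a, b) if such a function exists for T; reports whether it did.
template <typename T>
bool call_sum_if_exists(const T& a, const T& b)
{
    return call_sum_if_exists_impl(a, b, 0);
}

struct WithSum {};
struct WithoutSum {};

// Only WithSum has a matching sum(); it is found via ADL at instantiation.
void sum(const WithSum&, const WithSum&) {}
```

Because the lookup relies on ADL, this sketch works for class types; built-in types such as `int` have no associated namespace, so `sum` would have to be declared before the templates for those.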

use pointers to functions.
//initialize
typedef void (*PF)();
std::map<std::string, PF> defined_functions;
defined_functions["foo"]=&foo;
defined_functions["bar"]=&bar;
//if defined, invoke it
if(defined_functions.find("foo") != defined_functions.end())
{
defined_functions["foo"]();
}

If you know what library the function you'd like to call is in, then you can use dlsym() and dlerror() to find out whether or not it's there, and what the pointer to the function is.
Edit: I probably wouldn't actually use this approach - instead I would recommend Matiu's solution, as I think it's much better practice. However, dlsym() isn't very well known, so I thought I'd point it out.

You can use #pragma weak for the compilers that support it (see the weak symbol wikipedia entry).
This example and comment is from The Inside Story on Shared Libraries and Dynamic Loading:
#pragma weak debug
extern void debug(void);
void (*debugfunc)(void) = debug;
int main() {
printf("Hello World\n");
if (debugfunc) (*debugfunc)();
}
you can use the weak pragma to force the linker to ignore unresolved
symbols [..] the program compiles and links whether or not debug()
is actually defined in any object file. When the symbol remains
undefined, the linker usually replaces its value with 0. So, this
technique can be a useful way for a program to invoke optional code
that does not require recompiling the entire application.

So another way, if you're using c++11 would be to use functors:
You'll need to put this at the start of your file:
#include <functional>
The type of a functor is declared in this format:
std::function< return_type (param1_type, param2_type) >
You could add a variable that holds a functor for sum like this:
std::function<int(const std::vector<int>&)> sum;
To make things easy, let's shorten the param type:
using Numbers = const std::vector<int>&;
Then you could fill in the functor var with any one of:
A lambda:
sum = [](Numbers x) { return std::accumulate(x.cbegin(), x.cend(), 0); }; // std::accumulate comes from #include <numeric>
A function pointer:
int myFunc(Numbers nums) {
int result = 0;
for (int i : nums)
result += i;
return result;
}
sum = &myFunc;
Something that 'bind' has created:
struct Adder {
int startNumber = 6;
int doAdding(Numbers nums) {
int result = startNumber;
for (int i : nums)
result += i;
return result;
}
};
...
Adder myAdder{2}; // Make an adder that starts at two
sum = std::bind(&Adder::doAdding, myAdder, std::placeholders::_1);
Then finally to use it, it's a simple if statement:
if (sum)
return sum(x);
In summary, functors are the new pointer to a function, however they're more versatile. May actually be inlined if the compiler is sure enough, but generally are the same as a function pointer.
When combined with std::bind and lambdas they're quite superior to old style C function pointers.
But remember they work in c++11 and above environments. (Not in C or C++03).

In C++, a modified version of the trick for checking if a member exists should give you what you're looking for, at compile time instead of runtime:
#include <iostream>
#include <type_traits>
#include <typeinfo>
#include <utility>
namespace
{
template <class T, template <class...> class Test>
struct exists
{
template<class U>
static std::true_type check(Test<U>*);
template<class U>
static std::false_type check(...);
static constexpr bool value = decltype(check<T>(0))::value;
};
template<class U, class = decltype(sum(std::declval<U>(), std::declval<U>()))>
struct sum_test{};
template <class T>
void validate_sum()
{
if constexpr (exists<T, sum_test>::value)
{
std::cout << "sum exists for type " << typeid(T).name() << '\n';
}
else
{
std::cout << "sum does not exist for type " << typeid(T).name() << '\n';
}
}
class A {};
class B {};
void sum(const A& l, const A& r); // we only need to declare the function, not define it
}
int main(int, const char**)
{
validate_sum<A>();
validate_sum<B>();
}
Here's the output using clang:
sum exists for type N12_GLOBAL__N_11AE
sum does not exist for type N12_GLOBAL__N_11BE
I should point out that weird things happened when I used an int instead of A (sum() has to be declared before sum_test for the exists to work, so maybe exists isn't the right name for this). Some kind of template expansion that didn't seem to cause problems when I used A. Gonna guess it's ADL-related.

This answer applies only to global functions, as a complement to the other answers on testing methods.
First, provide a fallback dummy function in a separate namespace. Then determine the return type of the function-call, inside a template parameter. According to the return-type, determine if this is the fallback function or the wanted function.
If you are forbidden to add anything in the namespace of the function, such as the case for std::, then you should use ADL to find the right function in the test.
For example, std::reduce() is part of C++17, but early gcc compilers that claim C++17 support don't define std::reduce(). The following code can detect at compile time whether or not std::reduce is declared. See it work correctly in both cases in Compiler Explorer.
#include <numeric>
namespace fallback
{
// fallback
std::false_type reduce(...) { return {}; }
// Depending on
// std::reduce(Iter from, Iter to) -> decltype(*from)
// we know that a call to std::reduce(T*, T*) returns T
template <typename T, typename Ret = decltype(reduce(std::declval<T*>(), std::declval<T*>()))>
using return_of_reduce = Ret;
// Note that due to ADL, std::reduce is called although we don't explicitly call std::reduce().
// This is critical, since we are not allowed to define any of the above inside std::
}
using has_reduce = fallback::return_of_reduce<std::true_type>;
// using has_sum = std::conditional_t<std::is_same_v<fallback::return_of_sum<std::true_type>,
// std::false_type>,
// std::false_type,
// std::true_type>;
#include <iterator>
int main()
{
if constexpr (has_reduce::value)
{
// must have those, so that the compile will find the fallback
// function if the correct one is undefined (even if it never
// generates this code).
using namespace std;
using namespace fallback;
int values[] = {1,2,3};
return reduce(std::begin(values), std::end(values));
}
return -1;
}
In cases where, unlike the above example, you can't control the return type, you can use other methods, such as std::is_same and std::conditional.
For example, assume you want to test if function int sum(int, int) is declared in the current compilation unit. Create, in a similar fashion, test_sum_ns::return_of_sum. If the function exists, it will be int and std::false_type otherwise (or any other special type you like).
using has_sum = std::conditional_t<std::is_same_v<test_sum_ns::return_of_sum,
std::false_type>,
std::false_type,
std::true_type>;
Then you can use that type:
if constexpr (has_sum::value)
{
int result;
{
using namespace fallback; // limit this only to the call, if possible.
result = sum(1,2);
}
std::cout << "sum(1,2) = " << result << '\n';
}
NOTE: You must have the using namespace directive, otherwise the compiler will not find the fallback function inside the if constexpr and will complain. In general, you should avoid using namespace, since future changes in the symbols inside the namespace may break your code. In this case there is no other way around it, so at least limit it to the smallest scope possible, as in the above example.

C++ pimpl idiom wastes an instruction vs. C style?

(Yes, I know that one machine instruction usually doesn't matter. I'm asking this question because I want to understand the pimpl idiom, and use it in the best possible way; and because sometimes I do care about one machine instruction.)
In the sample code below, there are two classes, Thing and
OtherThing. Users would include "thing.hh".
Thing uses the pimpl idiom to hide its implementation.
OtherThing uses a C style – non-member functions that return and take
pointers. This style produces slightly better machine code. I'm
wondering: is there a way to use C++ style – ie, make the functions
into member functions – and yet still save the machine instruction. I like this style because it doesn't pollute the namespace outside the class.
Note: I'm only looking at calling member functions (in this case, calc). I'm not looking at object allocation.
Below are the files, commands, and the machine code, on my Mac.
thing.hh:
class ThingImpl;
class Thing
{
ThingImpl *impl;
public:
Thing();
int calc();
};
class OtherThing;
OtherThing *make_other();
int calc(OtherThing *);
thing.cc:
#include "thing.hh"
struct ThingImpl
{
int x;
};
Thing::Thing()
{
impl = new ThingImpl;
impl->x = 5;
}
int Thing::calc()
{
return impl->x + 1;
}
struct OtherThing
{
int x;
};
OtherThing *make_other()
{
OtherThing *t = new OtherThing;
t->x = 5;
return t;
}
int calc(OtherThing *t)
{
return t->x + 1;
}
main.cc (just to test the code actually works...)
#include "thing.hh"
#include <cstdio>
int main()
{
Thing *t = new Thing;
printf("calc: %d\n", t->calc());
OtherThing *t2 = make_other();
printf("calc: %d\n", calc(t2));
}
Makefile:
all: main
thing.o : thing.cc thing.hh
g++ -fomit-frame-pointer -O2 -c thing.cc
main.o : main.cc thing.hh
g++ -fomit-frame-pointer -O2 -c main.cc
main: main.o thing.o
g++ -O2 -o $@ $^
clean:
rm *.o
rm main
Run make and then look at the machine code. On the mac I use otool -tv thing.o | c++filt. On linux I think it's objdump -d thing.o. Here is the relevant output:
Thing::calc():
0000000000000000 movq (%rdi),%rax
0000000000000003 movl (%rax),%eax
0000000000000005 incl %eax
0000000000000007 ret
calc(OtherThing*):
0000000000000010 movl (%rdi),%eax
0000000000000012 incl %eax
0000000000000014 ret
Notice the extra instruction because of the pointer indirection. The first function looks up two fields (impl, then x), while the second only needs to get x. What can be done?

One instruction is rarely a thing to spend much time worrying over. Firstly, the compiler may cache the pImpl in a more complex use case, thus amortising the cost in a real-world scenario. Secondly, pipelined architectures make it almost impossible to predict the real cost in clock cycles. You'll get a much more realistic idea of the cost if you run these operations in a loop and time the difference.

Not too hard, just use the same technique inside your class. Any halfway decent optimizer will inline
the trivial wrapper.
class ThingImpl;
class Thing
{
ThingImpl *impl;
static int calc(ThingImpl*);
public:
Thing();
int calc() { return calc(impl); }
};

There's the nasty way, which is to replace the pointer to ThingImpl with a big-enough array of unsigned chars and then placement/new reinterpret cast/explicitly destruct the ThingImpl object.
Or you could just pass the Thing around by value, since it should be no larger than the pointer to the ThingImpl, though may require a little more than that (reference counting of the ThingImpl would defeat the optimisation, so you need some way of flagging the 'owning' Thing, which might require extra space on some architectures).
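The first suggestion above (in-place storage instead of a pointer) might look roughly like the sketch below. The buffer size and alignment constants are assumptions that have to be kept in sync with the real ThingImpl by hand, which is part of what makes this approach "nasty"; the names `kImplSize`, `kImplAlign`, and `storage` are hypothetical.

```cpp
#include <cstddef>
#include <new>

// --- what would live in thing.hh ---
class Thing
{
    // Assumed upper bounds for ThingImpl; verified in the constructor.
    static constexpr std::size_t kImplSize = 16;
    static constexpr std::size_t kImplAlign = 8;
    alignas(kImplAlign) unsigned char storage[kImplSize];
public:
    Thing();
    ~Thing();
    Thing(const Thing&) = delete;             // copying raw storage would be wrong
    Thing& operator=(const Thing&) = delete;
    int calc();
};

// --- what would live in thing.cc ---
struct ThingImpl
{
    int x;
};

Thing::Thing()
{
    static_assert(sizeof(ThingImpl) <= kImplSize, "storage too small");
    static_assert(alignof(ThingImpl) <= kImplAlign, "storage under-aligned");
    // Construct the implementation object inside the buffer.
    ::new (static_cast<void*>(storage)) ThingImpl{5};
}

Thing::~Thing()
{
    // Placement new requires an explicit destructor call.
    reinterpret_cast<ThingImpl*>(storage)->~ThingImpl();
}

int Thing::calc()
{
    // No pointer to chase: the implementation lives inside *this.
    // (C++17 code would typically wrap this cast in std::launder.)
    return reinterpret_cast<ThingImpl*>(storage)->x + 1;
}
```

The layout of ThingImpl stays hidden from users of the header, but its size and alignment leak into it, so any change to ThingImpl that outgrows the buffer forces a header change and a recompile of every client.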

I disagree about your usage: you are not comparing the 2 same things.
#include "thing.hh"
#include <cstdio>
int main()
{
Thing *t = new Thing; // 1
printf("calc: %d\n", t->calc());
OtherThing *t2 = make_other(); // 2
printf("calc: %d\n", calc(t2));
}
At // 1 you in fact have 2 calls to new: one is explicit and the other is implicit (done by the constructor of Thing).
At // 2 you have 1 new, implicit (inside make_other).
You should allocate Thing on the stack, though it would probably not change the double dereferencing instruction... but could change its cost (remove a cache miss).
However the main point is that Thing manages its memory on its own, so you can't forget to delete the actual memory, while you definitely can with the C-style method.
I would argue that automatic memory handling is worth an extra memory instruction, specifically because as it's been said, the dereferenced value will probably be cached if you access it more than once, thus amounting to almost nothing.
Correctness is more important than performance.

Let the compiler worry about it. It knows far more about what is actually faster or slower than we do. Especially on such a minute scale.
Having items in classes has far, far more benefits than just encapsulation. PIMPL's a great idea, if you've forgotten how to use the private keyword.
