In this talk (sorry about the sound) Chandler Carruth suggests not passing by reference, even const reference, in the vast majority of cases, because of the way it limits the back-end's ability to optimise.
He claims that in most cases the copy is negligible - which I am happy to believe, most data structures/classes etc. have a very small part allocated on the stack - especially when compared with the back-end having to assume pointer aliasing and all the nasty things that could be done to a reference type.
Let's say that we have a large object on the stack - say ~4kB - and a free-standing function that does something to an instance of this object.
Classically I would write:
void DoSomething(ExpensiveType* inOut);
ExpensiveType data;
...
DoSomething(&data);
He's suggesting:
ExpensiveType DoSomething(ExpensiveType in);
ExpensiveType data;
...
data = DoSomething(data);
According to what I got from the talk, the second would tend to optimise better. Is there a limit to how big I make something like this though, or is the back-end copy-elision stuff just going to prefer the values in almost all cases?
EDIT: To clarify I'm interested in the whole system, since I feel that this would be a major change to the way I write code, I've had use of refs over values drilled into me for anything larger than integral types for a long time now.
EDIT2: I tested it too, results and code here. No competition really: as we've been taught for a long time, the pointer is a far faster method of doing things. What intrigues me now is why it was suggested during that talk that we move to pass by value; since the numbers don't support it, it's not something I'm going to do.
I have now watched parts of Chandler's talk. I think the general discussion along the lines "should I now always pass by value" does not do his talk justice. Edit: And actually his talk has been discussed before, here value semantics vs output params with large data structures and in a blog from Eric Niebler, http://ericniebler.com/2013/10/13/out-parameters-vs-move-semantics/.
Back to Chandler. In the keynote he specifically (around the 4x-5x minute mark mentioned elsewhere) makes the following points:
If the optimizer cannot see the code of the called function you have much bigger problems than passing refs or values. It pretty much prevents optimization. (There is a follow-up question at that point about link time optimization which may be discussed later, I don't know.)
He recommends the "new classical" way of returning values using move semantics. Instead of the old school way of passing a reference to an existing object as an in-out parameter, the value should be constructed locally and moved out. The big advantage is that the optimizer can be sure that no part of the object is aliased, since only the function has access to it.
He mentions threads, storing a variable's value in globals, and observable behaviour like output as examples for unknowns which prevent optimization when only refs/pointers are passed. I think an abstract description could be "the local code can not assume that local value changes are undetected elsewhere, and it cannot assume that a value which is not changed locally has not changed at all". With local copies these assumptions could be made.
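To make the aliasing point concrete, here is a minimal sketch. The function names are made up for illustration and are not taken from the talk. With references, the two parameters may name the same int, so the compiler has to re-read in after every write to out; with values, it knows in cannot change.

```cpp
#include <cassert>

// Hypothetical helpers, not from the talk.
// By reference: out and in may refer to the same int, so in must be
// re-read after the first write to out.
void add_twice_ref(int& out, const int& in) {
    out += in;
    out += in;
}

// By value: in is a private copy that no other name can change.
int add_twice_val(int out, int in) {
    out += in;
    out += in;
    return out;
}

// Shows that the two versions disagree when the arguments alias.
void demo_aliasing() {
    int x = 1;
    add_twice_ref(x, x);              // x: 1 -> 2 -> 4
    assert(x == 4);
    assert(add_twice_val(1, 1) == 3); // 1 + 1 + 1
}
```

Because add_twice_ref(x, x) is a legal call, the optimizer cannot fold the two additions into out += 2 * in unless it can prove that the arguments never alias.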
Obviously, when passing (and possibly, if objects cannot be moved, when returning) by value, there is a trade-off between the copy cost and the optimization benefits. Size and other things making copying costly will tip the balance towards reference strategies, while lots of optimizable work on the object in the function tips it towards value passing. (His examples involved pointers to ints, not to 4k sized objects.)
Based on the parts I watched I do not think Chandler promoted passing by value as a one-fits-all strategy. I think he dissed passing by reference mostly in the context of passing an out parameter instead of returning a new object. His example was not about a function which modified an existing object.
On a general note:
A program should express the programmer's intent. If you need a copy, by all means do copy! If you want to modify an existing object, by all means use references or pointers. Only if side effects or run time behavior become unbearable, and really only then, try to do something smart.
One should also be aware that compiler optimizations are sometimes surprising. Other platforms, compilers, compiling options, class libraries or even just small changes in your own code may all prevent the compiler from coming to the rescue. The run-time cost of the change would in many cases come as a complete surprise.
Perhaps you took that part of the talk out of context, or something. For large objects, typically it depends on whether the function needs a copy of the object or not. For example:
void DoSomething(ExpensiveType in)
{
    std::cout << in.member;  // only reads in
}
Here you have wasted a lot of resources copying the object unnecessarily, when you could have passed by const reference instead.
But if the function is:
void DoSomething(ExpensiveType in)
{
    in.member = 5;           // modifies the local copy only
    do_something_else(in);
}
and we did not want to modify the calling function's object, then this code is likely to be more efficient than:
void DoSomething(ExpensiveType const &inr)
{
    ExpensiveType in = inr;  // explicit copy, even if inr is a temporary
    in.member = 5;
    do_something_else(in);
}
The difference comes when the function is invoked with an rvalue, e.g. DoSomething(ExpensiveType(6));. The latter creates a temporary, makes a copy, then destroys both; whereas the former will create a temporary and use that to move-construct in. (I think this can even undergo copy elision.)
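The difference can be observed by counting copy constructions. This is only a sketch: Tracked is a made-up stand-in for ExpensiveType, and the two functions are renamed so both can live in one file.

```cpp
#include <cassert>

// Hypothetical type that counts how often it is copy-constructed.
struct Tracked {
    static int copies;
    int member;
    explicit Tracked(int m) : member(m) {}
    Tracked(const Tracked& o) : member(o.member) { ++copies; }
    Tracked(Tracked&& o) noexcept : member(o.member) {}
};
int Tracked::copies = 0;

int by_value(Tracked in) {
    in.member = 5;
    return in.member;
}

int by_const_ref(const Tracked& inr) {
    Tracked in = inr;   // always a copy, even when inr is a temporary
    in.member = 5;
    return in.member;
}

// Number of copy constructions when each version is called with an rvalue.
int copies_by_value() {
    Tracked::copies = 0;
    by_value(Tracked(6));       // moved or elided, never copied
    return Tracked::copies;
}

int copies_by_const_ref() {
    Tracked::copies = 0;
    by_const_ref(Tracked(6));
    return Tracked::copies;
}

void demo_rvalue_argument() {
    assert(copies_by_value() == 0);
    assert(copies_by_const_ref() == 1);
}
```

With an lvalue argument both versions make exactly one copy, so the by-value form is never worse in copy count when the function needs its own copy anyway.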
NB. Don't use pointers as a hack to implement pass-by-reference. C++ has native pass by reference.
I tend to use the following notation a lot:
const Datum& d = a.calc();
// continue to use d
This works when the result of calc is on the stack, see http://herbsutter.com/2008/01/01/gotw-88-a-candidate-for-the-most-important-const/. Even though compilers probably optimize here, it feels nice to explicitly avoid temporary objects.
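For reference, a small sketch of the lifetime extension that the GotW article describes. The Datum type and calc() here are placeholders I made up; the point is only that the temporary stays alive as long as the const reference bound to it.

```cpp
#include <cassert>

// Hypothetical type that tracks how many instances are currently alive.
struct Datum {
    static int alive;
    int value;
    explicit Datum(int v) : value(v) { ++alive; }
    Datum(const Datum& o) : value(o.value) { ++alive; }
    ~Datum() { --alive; }
};
int Datum::alive = 0;

Datum calc() { return Datum(42); }   // returns by value, i.e. a temporary

void demo_lifetime_extension() {
    assert(Datum::alive == 0);
    {
        const Datum& d = calc();     // the temporary's lifetime is extended
        assert(Datum::alive == 1);   // still alive while d is in scope
        assert(d.value == 42);
    }                                // temporary destroyed here, with d
    assert(Datum::alive == 0);
}
```

This only applies when the function returns by value. If it returns a reference, nothing is extended and the reference is only as good as the object it points to.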
Today I realized that the content of d became invalid after data was written to a member of a. In this particular case the get function simply returned a reference to another member, one that was completely unrelated to the write. Like this:
const Datum& d = a.get();
// ... some operation ...
a.somedata = Datum2();
// now d is invalid.
Again, somedata has nothing to do with d or get() here.
Now I ask myself:
Which side-effects could lead to the invalidation?
Is it bad-practice to assign return values to a const reference? (Especially when I don't know the interior of the function)
My application is single-threaded except for a Qt GUI-Thread.
You seem to be afraid of elision failing: that auto x = some_func(); will result in an extra move construction compared to auto&& x = some_func(); when some_func() returns a temporary object.
You shouldn't be.
If elision fails, it means your compiler is incompetent, or compiling with frankly hostile settings. And you cannot survive hostile settings or incompetent compilers: incompetent compilers can turn a+=b with integers into for (int i = 0; i < abs(b); ++i) {if (b>0) ++a; else --a;} and violate not one iota of the standard.
Elision is a core language feature. Don't write bad code just because you don't trust it will happen.
You should capture by reference when you want a reference to the data provided by the function, and not an independently stable copy of the data. If you don't understand the function's return value's lifetime, capturing by reference is simply not safe.
Even if you know that the data will be stable, more time is spent maintaining code than writing it: the person reading your code must be able to see at a glance that your assumption holds. And non-local bugs are bad: a seemingly innocuous change to the function you call should not break your code.
The end result is, take things by value unless you have a good reason not to.
Taking things by value makes it easier for you, and the compiler, to reason about your code. It increases locality.
When you have a good reason not to, then take things by reference.
Depending on context, this good reason might not have to be very strong. But it shouldn't be based off an assumption of an incompetent compiler.
Premature pessimization should be avoided, but so should premature optimization. Taking (or storing) references instead of values should be something you do when and if you have identified performance problems. Write clean, easy to understand code. Shove complexity into tightly written types, and have the exterior interface be clean and simple. Take things by value, because values decouple state.
Optimization is fungible. By making more of your code simpler, you can make it easier to work with (and be more productive). Then, when you identify parts where performance matters, you can expend effort there to make the code faster.
A big example is foundational code: foundational code (which is used everywhere) quickly becomes a general drag on performance if not written with performance and ease of use in mind. In that case, you want to hide complexity inside the type, and expose a simple easy to use exterior interface that doesn't require the user to understand the guts.
But code in a random function somewhere? Use values, the easiest to use containers with the friendliest O-notation for the most expensive operation you do, and the easiest interface. Vectors if reasonable (avoid premature pessimization), but don't sweat a few maps.
Find the 1%-10% of your code that takes up 90%-99% of the time, and make that fast. If the rest of your code has good O-notation performance (so it won't become shockingly slower with larger data sets than you test with), you'll be in good shape. Then start testing with ridiculous data-sets, and find the slow parts then.
Which side-effects could lead to the invalidation?
Holding a reference to the internal state of a class (i.e., versus extending the lifetime of a temporary) and then calling any non-const member functions may potentially invalidate the reference.
Is it bad-practice to assign return values to a const reference? (Especially when I don't know the interior of the function)
I would say the bad practice is to hold a reference to the internal state of a class instance, mutating the class instance, and continuing to use the original reference (unless it is documented that the non-const functions don't invalidate the reference)
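A minimal sketch of the difference, using a made-up Holder class in place of a. Here the reference merely observes a changed value; if the member were something that reallocates (say, an element of a vector), the same pattern would leave the reference dangling, which I am not demonstrating because it is undefined behaviour.

```cpp
#include <cassert>

// Hypothetical class standing in for a in the question.
class Holder {
public:
    const int& get() const { return value_; }  // reference to internal state
    void set(int v) { value_ = v; }             // non-const mutator
private:
    int value_ = 1;
};

void demo_reference_vs_copy() {
    Holder a;
    const int& ref = a.get();  // aliases a's member
    int copy = a.get();        // independent value
    a.set(2);
    assert(ref == 2);          // the reference sees the mutation
    assert(copy == 1);         // the copy does not
}
```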
I am not sure that I am answering exactly what you expected to hear but...
const keyword has nothing to do with the 'unsafe' here.
Even if you return a non-const reference, it may become invalid. const only means that you are not allowed to modify the object through that reference, for example if your get() returns a const member, or if get() itself is declared const like this: const some_reference_t& get() const { return m_class_member; }.
Now regarding your questions at the end:
Which side-effects could lead to the invalidation?
There could be many side effects if the original value changes. For example let's assume that the returned value is a reference to an object on the heap which is deleted... or the returned value is cached while the original value gets updated. All these are design problems. If by design such a case can take place then the return value should be by value (and in case of caching, it must not be cached! :) ).
Is it bad-practice to assign return values to a const reference?
(Especially when I don't know the interior of the function)
Same thing. If by design you don't have to modify the object that you get (by reference or by value) then it is best to define it as const. Once you define something as const, the compiler will make sure that you are not trying to modify it somehow in the code.
It is rather a question of knowing what your functions return. You should also have a full understanding of the return value type and its semantics.
Based on what I've gathered from compiler writers, value types are much preferred to references/pointers in terms of efficiency.
This comes from the fact that value types are easier to reason about: you don't have to care about aliasing, externally changed memory (which the pointer refers to), the cost of pointer dereference, and such things. I have to say that while I understand such concerns I still have a few questions regarding specific cases.
Case #0
void foo(const float& f)
Okay, we have a reference here, but it's constant! Sure, we only have a constant view (ref) of it, so externally it might be changed, but that could only happen in a multithreaded world, and I am not sure if the compiler has to take it into consideration at all if there are no synchronization primitives used. Obviously if internally we used another pointer/reference to any float variable we might be at risk of modifying the f parameter. Can the compiler treat this parameter as safe (assuming we don't use any ref/ptr to float internally)?
Case #1
void foo(vector<int> f)
Speaking from a C++11/14 programmer's perspective, I know that the vector can be safely moved into the function. As we all know, internally the container holds a pointer to an array. Will the compiler treat the pointer as safe (no external modifications) as we just got a copy of the vector, so we are the only owners of it?
In other words: is a copied object treated as a safe one (because logically we make a clone of the object), or the compiler is not allowed to make such assumptions and any ptr/ref member has to be treated as potentially dangerous as the copy ctor/op might not have made a proper copy. Shouldn't the programmer be responsible for handling shared resources when copying them?
Bottomline:
Are constant pointers/references and copied complex objects generally slower than copies of primitives, and thus to be avoided as much as possible in performance critical code; or are they only slightly less efficient, so that we shouldn't fret about it?
As general rules in modern C++:
For (cheap to copy) primitive types, like int, float or double, etc., if it's an input (read-only) parameter, just pass by value:
inline double Square(double x)
{
return x*x;
}
Instead, if the type of the input parameter is not cheap to copy, but it's cheap to move, e.g. std::vector, std::string, etc., consider two sub-cases:
a. If the function is just observing the value, then pass by const reference (this will prevent useless potentially expensive deep-copies):
double Average(const std::vector<double> & v)
{
.... // compute average of elements in v (which is read-only)
}
b. If the function is taking a local copy of the parameter, then pass by value, and std::move from the value (this will allow optimizations thanks to move semantics):
void Person::SetName(std::string name)
{
m_name = std::move(name);
}
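Here is a slightly fuller sketch of case b, to show what the caller sees. The Name() accessor and the example strings are my additions for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

class Person {
public:
    // Sink parameter: taken by value, then moved into the member.
    void SetName(std::string name) { m_name = std::move(name); }
    const std::string& Name() const { return m_name; }  // hypothetical accessor
private:
    std::string m_name;
};

void demo_set_name() {
    Person p;
    std::string s = "Ada";
    p.SetName(s);                     // lvalue: copied into name, then moved
    assert(s == "Ada");               // the caller's string is untouched
    assert(p.Name() == "Ada");
    p.SetName(std::string("Grace"));  // rvalue: no copy of the characters
    assert(p.Name() == "Grace");
}
```

A caller who no longer needs its string can write p.SetName(std::move(s)) and avoid the copy as well.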
(Started as a comment but it wouldn't fit.)
Case #0 has already been discussed to death, for example:
Is it counter-productive to pass primitive types by reference?
which is already a duplicate of two other questions. In particular, I find this answer a good answer to your case #0 as well. Related questions:
Is it better in C++ to pass by value or pass by constant reference?
Reasons to not pass simple types by reference?
is there any specific case where pass-by-value is preferred over pass-by-const-reference in C++?
"const T &arg" vs. "T arg"
How to pass objects to functions in C++?
Case #1 is unclear to me: Do you need a copy or do you want to move? There is an enormous difference between the two and it is unclear from what you write which one you need.
If a reference suffices but you do a copy instead, you are wasting resources.
If you need to make a deep copy then that's all there is to it, neither references nor moving will help.
Please read this answer and revise case #1.
Case #0
No - It may be externally modified:
void foo(const float& f) {
..use f..
..call a function which is not pure..
..use (and reload) f..
}
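A compilable version of that outline, as a sketch. The global and the bump() function are made up; they stand in for "a function which is not pure". No threads are needed for the value behind the const reference to change.

```cpp
#include <cassert>

float g_value = 1.0f;               // hypothetical global
void bump() { g_value += 1.0f; }    // an impure function

float sum_by_ref(const float& f) {
    float before = f;
    bump();                // may change the object f refers to
    return before + f;     // so f has to be reloaded here
}

float sum_by_val(float f) {
    float before = f;
    bump();                // cannot affect the local copy
    return before + f;
}

void demo_case0() {
    g_value = 1.0f;
    assert(sum_by_ref(g_value) == 3.0f);  // 1 + 2
    g_value = 1.0f;
    assert(sum_by_val(g_value) == 2.0f);  // 1 + 1
}
```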
Case #1 ... Will the compiler treat the pointer as safe (no externally modifications) as we just got a copy of vector, so we are the only owners of it?
No - it must be pessimistic. It could be taught to rely on an implementation but in general, it has no reasonable way of tracking that pointer through all possible construction scenarios for arbitrary construction to verify it is safe, even if the implementations were visible.
Bottomline:
The cost of allocating and copying containers tends to be much greater than the cost of the loads and stores -- depends on your program, hardware, and implementation!
Passing small objects and builtins by reference doesn't mean an optimizer must treat it as a reference when the implementation is visible. E.g. If it sees the caller is passing a constant, it has the liberty to make the optimization based on the known constant value. Conversely, creating a copy can interfere with the ability to optimize your program since complexity can increase. Fretting over whether or not to pass this trivial/small type by value is an old micro-optimization. Copying a (non-SSO) string or vector OTOH can be huge in comparison. Focus on the semantics first.
I write tons of performance critical code and pass almost everything by (appropriately const-qualified) reference -- including builtins. You're counting instructions and speed of memory at that point (for your parameters), and that cost is very low on desktop and portable computers. I did plenty of testing on desktops and notebooks before settling on that. I do that for uniformity - you don't need to worry about the cost of introducing the reference (where overhead exists) outside embedded. Again, the cost of making unnecessary copies and any necessary dynamic allocations tends to be far greater. Also consider that objects have additional construction, copy, and destruction functions to execute -- even innocent looking types can cost much more to copy than to reference.
Which of the following examples is the better way of declaring the following function and why?
void myFunction (const int &myArgument);
or
void myFunction (int myArgument);
Use const T & arg if sizeof(T)>sizeof(void*) and use T arg if sizeof(T) <= sizeof(void*)
They do different things. const T& makes the function take a reference to the variable. On the other hand, T arg will call the copy constructor of the object and pass the copy.
If the copy constructor is not accessible (e.g. it's private), T arg won't work:
class Demo {
public: Demo() {}
private: Demo(const Demo& t) { }
};
void foo(Demo t) { }
int main() {
Demo t;
foo(t); // error: cannot copy `t`.
return 0;
}
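Since C++11 the same effect is usually written with = delete rather than a private copy constructor. A sketch of that variant follows; the id member and read_id function are invented for the example. A const reference parameter still accepts the object because nothing is copied.

```cpp
#include <cassert>
#include <type_traits>

// Non-copyable type, C++11 style.
struct Demo {
    int id = 7;                         // hypothetical payload
    Demo() = default;
    Demo(const Demo&) = delete;
    Demo& operator=(const Demo&) = delete;
};

static_assert(!std::is_copy_constructible<Demo>::value,
              "Demo cannot be passed by value from an lvalue");

int read_id(const Demo& d) { return d.id; }  // binds without copying

void demo_noncopyable() {
    Demo t;
    assert(read_id(t) == 7);
}
```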
For small values like primitive types (where all that matters is the contents of the object, not the actual referential identity; say, it's not a handle or something), T arg is generally preferred. For large objects, and for objects that you can't copy and/or where preserving referential identity is important (regardless of the size), passing the reference is preferred.
Another advantage of T arg is that since it's a copy, the callee cannot maliciously alter the original value. It can freely mutate the variable like any local variables to do its work.
Taken from Move constructors. I like the easy rules
1. If the function intends to change the argument as a side effect, take it by reference/pointer to a non-const object. Example:
void Transmogrify(Widget& toChange);
void Increment(int* pToBump);
2. If the function doesn't modify its argument and the argument is of primitive type, take it by value. Example:
double Cube(double value);
3. Otherwise
3.1. If the function always makes a copy of its argument inside, take it by value.
3.2. If the function never makes a copy of its argument, take it by reference to const.
3.3. Added by me: If the function sometimes makes a copy, then decide on gut feeling: If the copy is done almost always, then take by value. If the copy is done half of the time, go the safe way and take by reference to const.
In your case, you should take the int by value, because you don't intend to modify the argument, and the argument is of primitive type. I think of "primitive type" as either a non-class type or a type without a user defined copy constructor and where sizeof(T) is only a couple of bytes.
There's a popular piece of advice that states that the method of passing ("by value" vs "by const reference") should be chosen depending on the actual size of the type you are going to pass. Even in this discussion you have an answer labeled as "correct" that suggests exactly that.
In reality, basing your decision on the size of the type is not only incorrect, this is a major and rather blatant design error, revealing a serious lack of intuition/understanding of good programming practices.
Decisions based on the actual implementation-dependent physical sizes of the objects must be left to the compiler as often as possible. Trying to "tailor" your code to these sizes by hard-coding the passing method is a completely counterproductive waste of effort in 99 cases out of 100. (Yes, it is true that in the case of the C++ language the compiler doesn't have enough freedom to use these methods interchangeably - they are not really interchangeable in C++ in the general case. Although, if necessary, a proper size-based [semi-]automatic passing method selection might be implemented through template metaprogramming; but that's a different story).
The much more meaningful criterion for selecting the passing method when you write the code "by hand" might sound as follows:
Prefer to pass "by value" when you are passing an atomic, unitary, indivisible entity, such as a single non-aggregate value of any type - a number, a pointer, an iterator. Note that, for example, iterators are unitary values at the logical level. So, prefer to pass iterators by value, regardless of whether their actual size is greater than sizeof(void*). (STL implementation does exactly that, BTW).
Prefer to pass "by const reference" when you are passing an aggregate, compound value of any kind. i.e. a value that has exposed pronouncedly "compound" nature at the logical level, even if its size is no greater than sizeof(void*).
The separation between the two is not always clear, but that is how things always are with all such recommendations. Moreover, the separation into "atomic" and "compound" entities might depend on the specifics of your design, so the decision might actually differ from one design to the other.
Note, that this rule might produce decisions different from those of the allegedly "correct" size-based method mentioned in this discussion.
As an example, it is interesting to observe that the size-based method will suggest you manually hard-code different passing methods for different kinds of iterators, depending on their physical size. This makes it especially obvious how bogus the size-based method is.
Once again, one of the basic principles from which good programming practices derive is to avoid basing your decisions on physical characteristics of the platform (as much as possible). Instead, your decisions have to be based on the logical and conceptual properties of the entities in your program (as much as possible). The issue of passing "by value" or "by reference" is no exception here.
In C++11, the introduction of move semantics into the language produced a notable shift in the relative priorities of different parameter-passing methods. Under certain circumstances it might become perfectly feasible to pass even complex objects by value.
Should all/most setter functions in C++11 be written as function templates accepting universal references?
Contrary to popular and long-held beliefs, passing by const reference isn't necessarily faster even when you're passing a large object. You might want to read Dave Abrahams' recent article on this very subject.
Edit: (mostly in response to Jeff Hardy's comments): It's true that passing by const reference is probably the "safest" alternative under the largest number of circumstances -- but that doesn't mean it's always the best thing to do. But, to understand what's being discussed here, you really do need to read Dave's entire article quite carefully, as it is fairly technical, and the reasoning behind its conclusions is not always intuitively obvious (and you need to understand the reasoning to make intelligent choices).
Usually for built-in types you can just pass by value. They're small types.
For user defined types (or templates, when you don't know what is going to be passed) prefer const&. The size of a reference is probably smaller than the size of the type. And it won't incur an extra copy (no call to a copy constructor).
Well, yes ... the other answers about efficiency are true. But there's something else going on here which is important - passing a class by value creates a copy and, therefore, invokes the copy constructor. If you're doing fancy stuff there, it's another reason to use references.
A reference to const T is not worth the typing effort in case of scalar types like int, double, etc. The rule of thumb is that class-types should be accepted via ref-to-const. But for iterators (which could be class-types) we often make an exception.
In generic code you should probably write "T const&" most of the time to be on the safe side. There's also boost's call traits you can use to select the most promising parameter passing type. It basically uses ref-to-const for class types and pass-by-value for scalar types as far as I can tell.
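A rough sketch of what such a selection looks like. This is not Boost's implementation, just a hand-rolled trait of my own (param and equals are made-up names) that picks pass-by-value for scalar types and ref-to-const for everything else.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical trait, loosely in the spirit of boost::call_traits.
template <class T>
struct param {
    using type = typename std::conditional<std::is_scalar<T>::value,
                                           T, const T&>::type;
};

static_assert(std::is_same<param<int>::type, int>::value, "");
static_assert(std::is_same<param<double*>::type, double*>::value, "");
static_assert(std::is_same<param<std::string>::type,
                           const std::string&>::value, "");

// Example use: the parameter type is computed from T.
template <class T>
bool equals(typename param<T>::type a, typename param<T>::type b) {
    return a == b;
}

void demo_param() {
    assert(equals<int>(3, 3));
    assert(equals<std::string>("ab", std::string("ab")));
}
```

One cost of this design: T appears only in a non-deduced context, so callers have to spell out equals<int>(...) explicitly.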
But there are also situations where you might want to accept parameters by value, regardless of how expensive creating a copy can be. See Dave's article "Want Speed? Use pass by value!".
For simple types like int, double and char*, it makes sense to pass it by value. For more complex types, I use const T& unless there is a specific reason not to.
The cost of passing a 4 - 8 byte parameter is as low as you can get. You don't buy anything by passing a reference. For larger types, passing them by value can be expensive.
It won't make any difference for an int, as when you use a reference the memory address still has to be passed, and the memory address (void*) is usually about the size of an integer.
For types that contain a lot of data it becomes far more efficient as it avoids the huge overhead from having to copy the data.
Well the difference between the two doesn't really mean much for ints.
However, when using larger structures (or objects), the first method you used, pass by const reference, gives you access to the structure without need to copy it. The second case pass by value will instantiate a new structure that will have the same value as the argument.
In both cases you see this in the caller
myFunct(item);
To the caller, item will not be changed by myFunct, but the pass by reference will not incur the cost of creating a copy.
There is a very good answer to a similar question over at Pass by Reference / Value in C++
The difference between them is that one passes an int (which gets copied), and one uses the existing int. Since it's a const reference, it doesn't get changed, so it works much the same. The big difference here is that the function can alter the value of the int locally, but not the const reference. (I suppose some idiot could do the same thing with const_cast<>, or at least try to.) For larger objects, I can think of two differences.
First, some objects simply can't get copied, auto_ptr<>s and objects containing them being the obvious example.
Second, for large and complicated objects it's faster to pass by const reference than to copy. It's usually not a big deal, but passing objects by const reference is a useful habit to get into.
Either works fine. Don't waste your time worrying about this stuff.
The only time it might make a difference is when the type is a large struct, which might be expensive to pass on the stack. In that case, passing the arg as a pointer or a reference is (slightly) more efficient.
The problem appears when you are passing objects. If you pass by value, the copy constructor will be called. If you haven't implemented one, then a shallow copy of that object will be passed to the function.
Why is this a problem? If you have pointers to dynamically allocated memory, that memory could be freed when the destructor of the copy is called (when the copy leaves the function's scope). Then, when the destructor of the original object runs, you'll have a double free.
Moral: Write your copy constructors.
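A minimal sketch of a class that owns heap memory and copies it properly. Buffer and set_first are made-up names. With the deep-copying constructor in place, passing by value is safe and the original is left alone.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical owner of a heap array.
class Buffer {
public:
    explicit Buffer(std::size_t n) : size_(n), data_(new int[n]()) {}
    Buffer(const Buffer& o) : size_(o.size_), data_(new int[o.size_]) {
        std::copy(o.data_, o.data_ + o.size_, data_);  // deep copy
    }
    Buffer& operator=(Buffer o) {   // copy-and-swap assignment
        std::swap(size_, o.size_);
        std::swap(data_, o.data_);
        return *this;
    }
    ~Buffer() { delete[] data_; }
    int& operator[](std::size_t i) { return data_[i]; }
private:
    std::size_t size_;
    int* data_;
};

// Works on its own copy; the copy's array is freed when b goes out of scope.
int set_first(Buffer b) {
    b[0] = 99;
    return b[0];
}

void demo_deep_copy() {
    Buffer a(4);
    assert(set_first(a) == 99);
    assert(a[0] == 0);   // original untouched, and no double free at scope exit
}
```

In real code a std::vector<int> member would give the same behaviour without any hand-written copy constructor or destructor.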
If I have a functor class with no state, but I create it from the heap with new, are typical compilers smart enough to optimize away the creation overhead entirely?
This question has come up when making a bunch of stateless functors. If they're allocated on the stack, does their 0 state class body mean that the stack really isn't changed at all? It seems it must in case you later take an address of the functor instance.
Same for heap allocation.
In that case, functors are always adding a (trivial, but non-zero) overhead in their creation. But maybe compilers can see whether the address is used and if not it can eliminate that stack allocation. (Or, can it even eliminate a heap allocation?)
But how about a functor that's created as a temporary?
#include <iostream>
struct GTfunctor
{
inline bool operator()(int a, int b) {return a>b; }
};
int main()
{
GTfunctor* f= new GTfunctor;
GTfunctor g;
std::cout<< (*f)(2,1) << std::endl;
std::cout<< g(2,1) << std::endl;
std::cout<< GTfunctor()(2,1) << std::endl;
delete f;
}
So in the concrete example above, the three lines each call the same functor in three different ways. In this example, is there any efficiency difference between the ways? Or is the compiler able to optimize each line all the way down to being a compute-less print statement?
Edit:
Most answers say that the compiler could never inline/eliminate the heap allocated functor. But is this really true as well? Most compilers (GCC, MS, Intel) have linktime optimization as well which could indeed do this optimization. (but does it?)
are typical compilers smart enough to optimize away the creation overhead entirely?
When you're creating them on the heap, I doubt whether the compiler is allowed to. IMO:
Invoking new implies invoking operator new.
operator new is a non-trivial function defined in the run-time library.
The compiler isn't allowed to decide that you didn't really mean to invoke such a function and to decide that as an optimization it will silently not invoke it.
When you're creating them on the stack, and not taking their address, then maybe ... or maybe not: my guess is that every object has a non-zero size, in order to occupy some memory, in order to have an identity, even when the object has no state apart from its identity.
Obviously, it depends on your compiler.
I would say
No compiler will optimize away the object on the heap. (This is because, as ChrisW says, compilers will never optimize away a call to new, which is almost certainly defined in another translation unit.)
Some compilers will optimize away a named object on the stack. I've known gcc to do this optimization quite often.
Most compilers will optimize away an unnamed object on the stack. This is one of the "standard" C++ optimizations, especially as more advanced C++ users tend to create lots of unnamed temporary variables.
Unfortunately, these are only rules of thumb. Optimizers are notoriously unpredictable; really the only way to know what your compiler is doing is to read the assembly output.
I highly doubt this type of optimization is allowed, but if your functor has no state, why would you want to initialize it on the heap? It should be just as easy to use it as a temporary.
A C++ object is always non-zero in size. "Empty base class optimization" allows empty base class to have zero size but that doesn't apply here.
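Both statements can be checked with sizeof. A sketch, reusing the functor from the question; AsMember and AsBase are names I made up. The exact sizes are implementation-specific, so only the relationships are asserted.

```cpp
#include <cassert>

struct GTfunctor {
    bool operator()(int a, int b) const { return a > b; }
};

struct AsMember { GTfunctor f; int x; };  // functor stored as a member
struct AsBase : GTfunctor { int x; };     // functor used as an empty base

static_assert(sizeof(GTfunctor) >= 1, "complete objects are never zero-sized");

void demo_sizes() {
    // With the empty base class optimization the base adds no storage,
    // so AsBase is typically smaller than AsMember.
    assert(sizeof(AsBase) <= sizeof(AsMember));
    assert(AsBase{}(2, 1));   // the functor still works through the base
}
```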
I have not worked on any C++ optimizer, so whatever I say is just speculation. I think the 2nd and 3rd will be expanded inline easily, with no overhead and no GTfunctor created. The functor pointer, however, is a different story. In your example it may seem simple enough that any optimizer should be able to eliminate the heap allocation, but in a non-trivial program you may be creating the functors in one translation unit and using them in another, or even in a different library for which the compiler/linker/loader/runtime system has no source code, and then it is almost impossible to optimize. Given that the optimization is not easy, the potential performance gain is not great, and the number of cases where an empty functor is allocated on the heap is probably small, I think most optimizer programmers will not put this optimization high on their to-do list.
The compiler cannot optimize out a call to new or delete. It may however optimize out the variable created on the stack since it has no state.
Simple way to answer the heap question:
GTfunctor *f = new GTfunctor;
The value of f must not be null, so what should it be? And you also had:
GTfunctor *g = new GTfunctor;
Now the value of g must not equal the value of f, so what should each be?
Furthermore, neither f nor g may be equal to any other pointer obtained from new, unless some pointer elsewhere is somehow initialised to be equal to f or g, which (depending on the code that comes after) may involve examining what the entire rest of the program does.
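The requirements above can be written down as checks. A small sketch using the functor from the question:

```cpp
#include <cassert>

struct GTfunctor {
    bool operator()(int a, int b) const { return a > b; }
};

void demo_identity() {
    GTfunctor* f = new GTfunctor;
    GTfunctor* g = new GTfunctor;
    assert(f != nullptr && g != nullptr);
    assert(f != g);                      // two live objects, two addresses
    assert((*f)(2, 1) == (*g)(2, 1));    // yet they behave identically
    delete f;
    delete g;
}
```

As soon as a program compares or stores the pointers like this, the addresses become observable and have to be real.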
Yes, if by local examination of the code the compiler can see that you never rely on any of these requirements, then it could perform a rewrite such that no heap allocation occurs. The problem is, if your code was that simple, you could probably do that rewrite yourself and end up with a more readable program anyway, e.g. your test program would look like your stack-based g example. So real programs would not benefit from such an optimisation in the compiler.
Presumably the reason you're doing this is because sometimes the functor does have data, depending on which type is chosen at runtime. So compile-time analysis cannot usefully work its magic here.
The C++ Standard states that each object (imho on the heap) must have a size of at least one byte, so it can be uniquely addressed.
Generating functors with new can lead to two problems:
The construction generally cannot be optimized away. New is a function with complex side effects (bad_alloc).
Because you address the functor indirectly the compiler may not be able to inline the function.
Chances are good that you will not see a sign of the functor, if you generate it on the stack.
Side note: The inline keyword is not necessary. Every function which is defined inside a class definition is implicitly inline.
The compiler can probably figure out that operator() doesn't use any member variables, and optimize it to the max. I wouldn't make any assumptions about the local or heap allocated variables, though.
Edit: When in doubt, turn on the assembly output option on your compiler and see what it's actually doing. No sense in listening to a bunch of idiots on the web when you can see the real answer for yourself.
The answer to your question has two aspects.
Does the compiler optimize away the heap allocation? I strongly doubt it, but I'm not a standard guy, so I would have to look it up.
Can the compiler optimize by inlining the object's operator()? Yes. As long as you don't declare the call operator virtual, even the pointer dereferencing isn't actually performed.