How do I know when to use `const ref` or `in`? - d

void foo(T, size_t size)(in T[size] data){...}
//vs
void foo(T, size_t size)(const ref T[size] data){...}
According to https://stackoverflow.com/a/271344/944430 it seems that in C++ pass by value can be faster in some situations.
But D has a special keyword in and I am wondering when I should use it. Does in always result in a copy or is it a compiler optimization?
Are there any guidelines that I can follow that help me decide between const ref and in?

I would argue that you should never use in on function parameters. in is an artifact from D1 that was kept to reduce code breakage but was changed to be equivalent to const scope. So, every time you think of typing in on a function parameter, think of const scope, since that's what you're really doing. And scope currently only does anything with delegates, in which case it tells the compiler that the function taking the delegate is not going to return it or assign it to anything, and that therefore no closure has to be allocated to hold the state of that delegate (so it improves efficiency in many cases for delegates). For all other types, it's completely ignored, which means that using it is meaningless (and potentially confusing). And if it ever does come to mean something else for other types (e.g. it's been suggested that it should enforce that a pointer that's passed in as scope can't escape the function), then the semantics of your code could change in unexpected ways. Presumably, it'll be accompanied by the appropriate warnings when that happens, but why mark your code with a meaningless attribute that could have meaning later and thus force you to change your code? At this point, scope should only be used on delegates, so in should only be used on delegates, and you don't usually want const delegates. So, just don't use in.
So, ultimately, what you're really asking is whether you should use const or const ref.
The short answer is that you generally shouldn't use ref unless you want to mutate the argument you're passing in. I would also point out that this question is meaningless for anything but structs and maybe static arrays, because classes are already reference types, and none of the built-in types (save for static arrays) cost much of anything to copy. The longer answer is...
Move semantics are built into D, so if you have a function that takes its argument by value - e.g.
auto foo(Bar bar) { ... }
then it will move the argument if it can. If you pass it an lvalue (a value that can be on the left-hand side of an assignment), then that value is going to be copied except maybe in circumstances where the compiler is able to determine that it can optimize the copy away (e.g. when the variable is never used after that function call), but that's going to depend on the compiler and compiler flags used. So, passing a variable to a function by value will usually result in a copy. However, if you pass the function an rvalue (a value that can't go on the left-hand side of an assignment), then it will move that object rather than copying it. This is different from C++, where move semantics were not introduced until C++11, and even then, they require move constructors, whereas D uses postblit constructors, which changes it so that moves can be done by default. A couple of previous SO questions on that:
Does D have something akin to C++0x's move semantics?
Questions about postblit and move semantics
So, yes, there are cases in D where passing by ref would avoid a copy, but in D, ref always requires an lvalue (even with const). So, if you start putting ref const(T) everywhere like you'd do const T& in C++, you're going to have a lot of functions which are really annoying to call, because every temporary will have to be assigned to a variable first to call the function. So, you should seriously consider only ever using ref for when you want to mutate a variable that's passed in and not for efficiency. Certainly, your default should be to not pass by const ref, but if you do need that extra efficiency, you have two options:
Overload the function on ref-ness so that you have one overload that takes by const ref and one that takes by const value, so that the lvalues get passed to one without being copied, and the rvalues get passed to the other without needing an extraneous variable. e.g.
auto foo(const Bar bar) { foo(bar); }
auto foo(ref const(Bar) bar) { ... }
And that's a bit annoying but works well enough when you only have one parameter with ref. However, you get a combinatorial explosion of overloads as more ref parameters are added. e.g.
auto foo(const Bar bar, const Glop glop) { foo(bar, glop); }
auto foo(ref const(Bar) bar, const Glop glop) { foo(bar, glop); }
auto foo(const Bar bar, ref const(Glop) glop) { foo(bar, glop); }
auto foo(ref const(Bar) bar, ref const(Glop) glop) { ... }
So, that works to a point, but it's not particularly pleasant. And if you define the overloads like I did here, then it also has the downside that the rvalues end up being passed to a wrapper function (adding an extra function call - though one that should be quite inlinable), which means that they're now passed by ref to the main overload and if one of those parameters is passed to another function or returned, the compiler can't do a move, whereas if ref hadn't been involved, then it could have. That's one of the reasons that it's now argued that you shouldn't use const T& heavily in C++11 like you would have done in C++98.
You can get around that problem by duplicating the function body for each overload, but that obviously creates a maintenance problem as well as creating code bloat.
The alternative is to use auto ref, which basically does that for you, but the function has to be templated. e.g.
auto foo()(const auto ref Bar bar, const auto ref Glop glop) { ... }
So, now you only have one overload, but it still generates all of those overloads with the full code underneath the hood every time the template is instantiated with a different combination of ref-ness. So, your code is cleaner, but you still get more bloat, and if you need to do this with a virtual function, then you're out of luck and have to go back to the more explicit overload solution, because templated functions can't be virtual.
So, in general, trying to have your functions accept const ref for efficiency reasons just gets ugly. The fact that D has move semantics built in reduces the need for it (just like with C++11, it's now argued that passing by value is often better, thanks to move semantics and how the compiler optimizes them). And it's ugly enough to do in D in the general case that unless you actually get a performance boost that matters, it's probably not worth passing by ref just for efficiency. You should probably avoid using ref for efficiency unless you've actually measured a difference in performance that's worth the pain.
The other thing to consider - separate from ref-ness - is that D's const is a lot more restrictive than C++'s const (e.g. casting away const and mutating is undefined behavior in D, and D's const is transitive). So, slapping const all over the place can sometimes become problematic - especially in generic code. So, using it can be great for preventing accidental mutation or indicating that a function does not mutate its arguments, but don't just blithely slap it everywhere that shouldn't be mutating the variable like you would in C++. Use it where it makes sense, but be aware that you will run into cases where D's const is too restrictive to be usable, even if C++'s const would have worked.
So, in most cases, when you want your function to take a T, you should default to it taking a plain T. And then if you know that efficiency is a concern, you can consider using some form of ref (probably favoring auto ref or const auto ref if you're not dealing with a virtual function). But default to not using ref. Your life will be much more pleasant that way.

Related

What to do when library function parameters aren't const

Most of my code-base is immutable; however, due to quirks of the language design, I'm unable to mark my variables const.
In a vast majority of cases, especially when inter-operating with C code, I find function parameters not marked const, even though they provably do not modify them.
One such example is fts_open(...). At this point the compiler forces me to tediously remove const qualifiers from large parts of my code, thereby removing the safety they offered.
One trivial solution is to compile with -fpermissive, but this is completely contrary to my intent.
Apart from rewriting every single C library ever written, what can I do to still get the benefits from leaning on the compiler?
i.e. this type of code does not work:
void function(immutable_type const &param)
{
char const * const fts_arg[2]{std::data(param.path), nullptr};
FTS *tree = fts_open(fts_arg, FTS_OPTIONS, nullptr);
...
}
At this point I have to:
Remove const from the fts_args variable.
Remove const from the function parameter.
Remove const from path inside the datatype definition.
Remove const from the variable being passed to function.
Recursively remove consts from the entire call chain.
Thank you. :)
This is exactly what const_cast is for. If you absolutely know that a function won't change the pointed/referenced object, then it is OK to const_cast the constness of your pointer/reference away in order to pass it into that function, despite referring to a const object.
isn't const_casting from const undefined behavior?
No. const_cast itself is never UB. But modifying a const object is. So if you cannot prove that a function taking a non-const pointer/reference doesn't modify the object, then it is not safe to pass the const_casted reference into that function.
Also consider whether the implementation might be changed in future to use non-constness.
In the case where you cannot prove that the object reached through the non-const pointer/reference won't be modified, you can make a local copy of the const argument in your wrapper function and pass that copy instead. The overhead of this copy may be trivial (int) or non-trivial (a long std::vector).
If you cannot prove that the object won't be modified, and copying is expensive (or not possible), then as last resort, you have to get rid of constness of your own argument (and propagate the change up the call chain). Or use another API in the implementation.
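A minimal C++ sketch of the first two options described above. The functions legacy_length and legacy_upcase_first are hypothetical stand-ins for C-style functions whose parameters lack const; they are not taken from any real library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical C-style function: takes char* but provably only reads.
std::size_t legacy_length(char* s) {
    assert(s != nullptr);
    return std::strlen(s);
}

// Hypothetical C-style function: really does write through its argument.
void legacy_upcase_first(char* s) {
    assert(s != nullptr);
    if (s[0] >= 'a' && s[0] <= 'z') s[0] = static_cast<char>(s[0] - 'a' + 'A');
}

// Option 1: the callee never writes, so casting away const is safe.
std::size_t length_of(const std::string& text) {
    return legacy_length(const_cast<char*>(text.c_str()));
}

// Option 2: the callee may write, so hand it a local copy instead.
std::string upcased_first(const std::string& text) {
    std::string copy = text;        // the caller's object stays untouched
    legacy_upcase_first(&copy[0]);  // mutable, null-terminated buffer
    return copy;
}
```

With these definitions, length_of("hello") yields 5 and upcased_first("abc") yields "Abc", while the string passed in is left unchanged in both cases.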

Pass by value vs pass by rvalue reference

When should I declare my function as:
void foo(Widget w);
as opposed to:
void foo(Widget&& w);?
Assume this is the only overload (as in, I pick one or the other, not both, and no other overloads). No templates involved. Assume that the function foo requires ownership of the Widget (e.g. const Widget& is not part of this discussion). I'm not interested in any answer outside the scope of these circumstances. (See addendum at end of post for why these constraints are part of the question.)
The primary difference that my colleagues and I can come up with is that the rvalue reference parameter forces you to be explicit about copies. The caller is responsible for making an explicit copy and then passing it in with std::move when you want a copy. In the pass by value case, the cost of the copy is hidden:
//If foo is a pass by value function, calling + making a copy:
Widget x{};
foo(x); //Implicit copy
//Not shown: continues to use x locally
//If foo is a pass by rvalue reference function, calling + making a copy:
Widget x{};
//foo(x); //This would be a compiler error
auto copy = x; //Explicit copy
foo(std::move(copy));
//Not shown: continues to use x locally
Other than forcing people to be explicit about copying and changing how much syntactic sugar you get when calling the function, how else are these different? What do they say differently about the interface? Are they more or less efficient than one another?
Other things that my colleagues and I have already thought of:
The rvalue reference parameter means that you may move the argument, but does not mandate it. It is possible that the argument you passed in at the call site will be in its original state afterwards. It's also possible the function would eat/change the argument without ever calling a move constructor but assume that because it was an rvalue reference, the caller relinquished control. Pass by value, if you move into it, you must assume that a move happened; there's no choice.
Assuming no elisions, a single move constructor call is eliminated with pass by rvalue.
The compiler has better opportunity to elide copies/moves with pass by value. Can anyone substantiate this claim? Preferably with a link to gcc.godbolt.org showing optimized generated code from gcc/clang rather than a line in the standard. My attempt at showing this was probably not able to successfully isolate the behavior: https://godbolt.org/g/4yomtt
Addendum: why am I constraining this problem so much?
No overloads - if there were other overloads, this would devolve into a discussion of pass by value vs a set of overloads that include both const reference and rvalue reference, at which point the set of overloads is obviously more efficient and wins. This is well known, and therefore not interesting.
No templates - I'm not interested in how forwarding references fit into the picture. If you have a forwarding reference, you call std::forward anyway. The goal with a forwarding reference is to pass things as you received them. Copies aren't relevant because you just pass an lvalue instead. It's well known, and not interesting.
foo requires ownership of Widget (aka no const Widget&) - We're not talking about read-only functions. If the function was read-only or didn't need to own or extend the lifetime of the Widget, then the answer trivially becomes const Widget&, which again, is well known, and not interesting. I also refer you to why we don't want to talk about overloads.
What do rvalue usages say about an interface versus copying?
rvalue suggests to the caller that the function both wants to own the value and has no intention of letting the caller know of any changes it has made. Consider the following (I know you said no lvalue references in your example, but bear with me):
//Hello. I want my own local copy of your Widget that I will manipulate,
//but I don't want my changes to affect the one you have. I may or may not
//hold onto it for later, but that's none of your business.
void foo(Widget w);
//Hello. I want to take your Widget and play with it. It may be in a
//different state than when you gave it to me, but it'll still be yours
//when I'm finished. Trust me!
void foo(Widget& w);
//Hello. Can I see that Widget of yours? I don't want to mess with it;
//I just want to check something out on it. Read that one value from it,
//or observe what state it's in. I won't touch it and I won't keep it.
void foo(const Widget& w);
//Hello. Ooh, I like that Widget you have. You're not going to use it
//anymore, are you? Please just give it to me. Thank you! It's my
//responsibility now, so don't worry about it anymore, m'kay?
void foo(Widget&& w);
For another way of looking at it:
//Here, let me buy you a new car just like mine. I don't care if you wreck
//it or give it a new paint job; you have yours and I have mine.
void foo(Car c);
//Here are the keys to my car. I understand that it may come back...
//not quite the same... as I lent it to you, but I'm okay with that.
void foo(Car& c);
//Here are the keys to my car as long as you promise to not give it a
//paint job or anything like that
void foo(const Car& c);
//I don't need my car anymore, so I'm signing the title over to you now.
//Happy birthday!
void foo(Car&& c);
Now, if Widgets have to remain unique (as actual widgets in, say, GTK do) then the first option cannot work. The second, third and fourth options make sense, because there's still only one real representation of the data. Anyway, that's what those semantics say to me when I see them in code.
Now, as for efficiency: it depends. rvalue references can save a lot of time if Widget has a pointer to a data member whose pointed-to contents can be rather large (think an array). Since the caller used an rvalue, they're saying they don't care about what they're giving you anymore. So, if you want to move the caller's Widget's contents into your Widget, just take their pointer. No need to meticulously copy each element in the data structure their pointer points to. This can lead to pretty good improvements in speed (again, think arrays). But if the Widget class doesn't have any such thing, this benefit is nowhere to be seen.
Hopefully that gets at what you were asking; if not, I can perhaps expand/clarify things.
The rvalue reference parameter forces you to be explicit about copies.
Yes, pass-by-rvalue-reference got a point.
The rvalue reference parameter means that you may move the argument, but does not mandate it.
Yes, pass-by-value got a point.
But that also gives pass-by-rvalue the opportunity to handle the exception guarantee: if foo throws, the widget value is not necessarily consumed.
For move-only types (as std::unique_ptr), pass-by-value seems to be the norm (mostly for your second point, and first point is not applicable anyway).
EDIT: standard library contradicts my previous sentence, one of shared_ptr's constructor takes std::unique_ptr<T, D>&&.
For types which have both copy/move (as std::shared_ptr), we have the choice between coherency with the previous types and forcing the caller to be explicit about copies.
Unless you want to guarantee there is no unwanted copy, I would use pass-by-value for coherency.
Unless you want guaranteed and/or immediate sink, I would use pass-by-rvalue.
For existing code base, I would keep consistency.
Unless the type is a move-only type you normally have an option to pass by reference-to-const and it seems arbitrary to make it "not part of the discussion" but I will try.
I think the choice partly depends on what foo is going to do with the parameter.
The function needs a local copy
Let's say Widget is an iterator and you want to implement your own std::next function. next needs its own copy to advance and then return. In this case your choice is something like:
Widget next(Widget it, int n = 1){
std::advance(it, n);
return it;
}
vs
Widget next(Widget&& it, int n = 1){
std::advance(it, n);
return std::move(it);
}
I think by-value is better here. From the signature you can see it is taking a copy. If the caller wants to avoid a copy they can do a std::move and guarantee the variable is moved from but they can still pass lvalues if they want to.
With pass-by-rvalue-reference the caller cannot guarantee that the variable has been moved from.
Move-assignment to a copy
Let's say you have a class WidgetHolder:
class WidgetHolder {
Widget widget;
//...
};
and you need to implement a setWidget member function. I'm going to assume you already have an overload that takes a reference-to-const:
void WidgetHolder::setWidget(const Widget& w) {
widget = w;
}
but after measuring performance you decide you need to optimize for r-values. You have a choice between replacing it with:
void WidgetHolder::setWidget(Widget w) {
widget = std::move(w);
}
Or overloading with:
void WidgetHolder::setWidget(Widget&& w) {
widget = std::move(w);
}
This one is a little bit more tricky. It is tempting to choose pass-by-value because it accepts both rvalues and lvalues, so you don't need two overloads. However, it unconditionally takes a copy, so you can't take advantage of any existing capacity in the member variable. The pass by reference-to-const and pass by r-value reference overloads use assignment without taking a copy, which might be faster.
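The trade-off can be made visible by counting special member calls. The following is only a sketch: the counting Widget, the Counts struct, and the two holder types are hypothetical, written just for this illustration.

```cpp
#include <cassert>
#include <utility>

struct Counts { int copy_ctor = 0; int move_ctor = 0; int copy_assign = 0; int move_assign = 0; };
Counts g_counts;  // hypothetical global tally, reset before each measurement

struct Widget {
    Widget() = default;
    Widget(const Widget&) { ++g_counts.copy_ctor; }
    Widget(Widget&&) noexcept { ++g_counts.move_ctor; }
    Widget& operator=(const Widget&) { ++g_counts.copy_assign; return *this; }
    Widget& operator=(Widget&&) noexcept { ++g_counts.move_assign; return *this; }
};

struct HolderByValue {  // single by-value setter
    Widget widget;
    void setWidget(Widget w) { widget = std::move(w); }
};

struct HolderOverloaded {  // reference-to-const plus rvalue-reference overloads
    Widget widget;
    void setWidget(const Widget& w) { widget = w; }
    void setWidget(Widget&& w) { widget = std::move(w); }
};

template <typename Holder>
Counts cost_of_lvalue_call() {
    Holder h; Widget x;
    g_counts = Counts{};
    h.setWidget(x);
    return g_counts;
}

template <typename Holder>
Counts cost_of_rvalue_call() {
    Holder h; Widget x;
    g_counts = Counts{};
    h.setWidget(std::move(x));
    return g_counts;
}

void demo() {
    // By value: a fresh Widget is always constructed, then move-assigned.
    assert(cost_of_lvalue_call<HolderByValue>().copy_ctor == 1);
    assert(cost_of_rvalue_call<HolderByValue>().move_ctor == 1);
    // Overloads: only an assignment into the existing member.
    assert(cost_of_lvalue_call<HolderOverloaded>().copy_assign == 1);
    assert(cost_of_lvalue_call<HolderOverloaded>().copy_ctor == 0);
    assert(cost_of_rvalue_call<HolderOverloaded>().move_assign == 1);
}
```

The copy assignment in the overloaded version is the place where a type such as std::vector or std::string could reuse storage it already owns; the copy construction in the by-value version cannot.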
Move-construct a copy
Now let's say you are writing the constructor for WidgetHolder and as before you have already implemented a constructor that takes a reference-to-const:
WidgetHolder::WidgetHolder(const Widget& w) : widget(w) {
}
and as before you have measured performance and decided you need to optimize for rvalues. You have a choice between replacing it with:
WidgetHolder::WidgetHolder(Widget w) : widget(std::move(w)) {
}
Or overloading with:
WidgetHolder::WidgetHolder(Widget&& w) : widget(std::move(w)) {
}
In this case, the member variable cannot have any existing capacity since this is the constructor. You are move-constructing a copy. Also, constructors often take many parameters, so it can be quite a pain to write all the different permutations of overloads to optimize for r-value references. So in this case it is a good idea to use pass-by-value, especially if the constructor takes many such parameters.
Passing unique_ptr
With unique_ptr the efficiency concerns are less important given that a move is so cheap and it doesn't have any capacity. More important is expressiveness and correctness. There is a good discussion of how to pass unique_ptr here.
When you pass by rvalue reference object lifetimes get complicated. If the callee does not move out of the argument, the destruction of the argument is delayed. I think this is interesting in two cases.
First, you have an RAII class
void fn(RAII &&);
RAII x{underlying_resource};
fn(std::move(x));
// later in the code
RAII y{underlying_resource};
When initializing y, the resource could still be held by x if fn doesn't move out of the rvalue reference. In the pass by value code, we know that x gets moved out of, and fn releases x. This is probably a case where you would want to pass by value, and the copy constructor would likely be deleted, so you wouldn't have to worry about accidental copies.
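A small sketch of that RAII case. Resource and Guard are hypothetical types made up for this illustration; Guard simply counts how many live guards hold the resource.

```cpp
#include <cassert>
#include <utility>

struct Resource { int holders = 0; };  // hypothetical resource with a holder count

class Guard {  // hypothetical move-only RAII wrapper
    Resource* res_;
public:
    explicit Guard(Resource& r) : res_(&r) { ++res_->holders; }
    Guard(Guard&& other) noexcept : res_(other.res_) { other.res_ = nullptr; }
    Guard(const Guard&) = delete;
    Guard& operator=(const Guard&) = delete;
    ~Guard() { if (res_) --res_->holders; }
};

// Neither function does anything with its argument.
void fn_by_rvalue_ref(Guard&&) {}
void fn_by_value(Guard) {}

int holders_after_rvalue_ref_call() {
    Resource r;
    Guard x{r};
    fn_by_rvalue_ref(std::move(x));  // nothing was moved out of x
    return r.holders;                // x still holds the resource here
}

int holders_after_by_value_call() {
    Resource r;
    Guard x{r};
    fn_by_value(std::move(x));  // x was moved into the parameter, which has been destroyed
    return r.holders;
}

void demo() {
    assert(holders_after_rvalue_ref_call() == 1);
    assert(holders_after_by_value_call() == 0);
}
```

In the rvalue-reference version the resource is only released when x goes out of scope in the caller, which is the delayed destruction described above.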
Second, if the argument is a large object and the function doesn't move out, the lifetime of the vector's data is longer than in the case of pass by value.
vector<B> fn1(vector<A> &&x);
vector<C> fn2(vector<B> &&x);
vector<A> va; // large vector
vector<B> vb = fn1(std::move(va));
vector<C> vc = fn2(std::move(vb));
In the example above, if fn1 and fn2 don't move out of x, then you will end up with all of the data in all of the vectors still alive. If you instead pass by value, only the last vector's data will still be alive (assuming vector's move constructor clears the source vector).
One issue not mentioned in the other answers is the idea of exception-safety.
In general, if the function throws an exception, we would ideally like to have the strong exception guarantee, meaning that the call has no effect other than raising the exception. If pass-by-value uses the move constructor, then such an effect is essentially unavoidable. So an rvalue-reference argument may be superior in some cases. (Of course, there are various cases where the strong exception guarantee isn't achievable either way, as well as various cases where the no-throw guarantee is available either way. So this is not relevant in 100% of cases. But it's relevant sometimes.)
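As a rough illustration of the point, here is a sketch using std::unique_ptr, whose moved-from state is guaranteed to be null. The two sink functions are hypothetical and fail before doing anything with their argument.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <utility>

// Hypothetical sinks that throw before they use the argument.
void sink_by_value(std::unique_ptr<int>) { throw std::runtime_error("failed"); }
void sink_by_rvalue_ref(std::unique_ptr<int>&&) { throw std::runtime_error("failed"); }

// Each helper reports whether the caller still owns its object after the failed call.
bool caller_keeps_object_by_value() {
    std::unique_ptr<int> p(new int(42));
    try {
        sink_by_value(std::move(p));  // p is moved into the parameter before the body runs
    } catch (const std::runtime_error&) {
    }
    return p != nullptr;
}

bool caller_keeps_object_by_rvalue_ref() {
    std::unique_ptr<int> p(new int(42));
    try {
        sink_by_rvalue_ref(std::move(p));  // only a reference is bound; nothing moves
    } catch (const std::runtime_error&) {
    }
    return p != nullptr;
}

void demo() {
    assert(!caller_keeps_object_by_value());
    assert(caller_keeps_object_by_rvalue_ref());
}
```

With the rvalue-reference signature, the failed call leaves the caller's pointer intact, so the strong guarantee is at least possible; with pass-by-value the object is gone by the time the exception propagates.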
Choosing between by-value and by-rvalue-ref, with no other overloads, is not meaningful.
With pass by value the actual argument can be an lvalue expression.
With pass by rvalue-ref the actual argument must be an rvalue.
If the function is storing a copy of the argument, then a sensible choice is between pass-by-value, and a set of overloads with pass-by-ref-to-const and pass-by-rvalue-ref. For an rvalue expression as actual argument the set of overloads can avoid one move. It's an engineering gut-feeling decision whether the micro-optimization is worth the added complexity and typing.
One notable difference is that if you move to a pass-by-value function:
void foo(Widget w);
foo(std::move(copy));
the compiler must generate a move-constructor call Widget(Widget&&) to create the value object. In the case of pass-by-rvalue-reference no such call is needed, as the rvalue reference is passed directly to the function. Usually this does not matter, as move constructors are trivial (or default) and are inlined most of the time.
(you can check it on gcc.godbolt.org -- in your example declare move constructor Widget(Widget&&); and it will show up in assembly)
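The same observation can be made without reading assembly by counting calls. This is a sketch with a hypothetical Widget whose move constructor increments a counter.

```cpp
#include <cassert>
#include <utility>

struct Widget {  // hypothetical type that counts its move-constructor calls
    static int move_ctor_calls;
    Widget() = default;
    Widget(const Widget&) = default;
    Widget(Widget&&) noexcept { ++move_ctor_calls; }
};
int Widget::move_ctor_calls = 0;

void foo_by_value(Widget) {}
void foo_by_rvalue_ref(Widget&&) {}

int moves_when_passing_by_value() {
    Widget copy;
    Widget::move_ctor_calls = 0;
    foo_by_value(std::move(copy));  // the parameter object is move-constructed
    return Widget::move_ctor_calls;
}

int moves_when_passing_by_rvalue_ref() {
    Widget copy;
    Widget::move_ctor_calls = 0;
    foo_by_rvalue_ref(std::move(copy));  // only a reference is bound
    return Widget::move_ctor_calls;
}

void demo() {
    assert(moves_when_passing_by_value() == 1);
    assert(moves_when_passing_by_rvalue_ref() == 0);
}
```

Because std::move(copy) is an xvalue rather than a prvalue, the move into the by-value parameter cannot be elided, which is why the count is exactly one.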
So my rule of thumb is this:
if the object represents a unique resource (without copy semantics) I prefer to use pass-by-rvalue-reference,
otherwise if it logically makes sense to either move or copy the object, I use pass-by-value.

Any way to pass an rvalue/temp object to function that expects a non-const reference?

I understand that c++ only allows rvalues or temp objects to bind to const-references. (Or something close to that...)
For example, assuming I have the functions doStuff(SomeValue & input)
and SomeValue getNiceValue() defined:
/* These do not work */
app->doStuff(SomeValue("value1"));
app->doStuff(getNiceValue());
/* These all work, but seem awkward enough that they must be wrong. :) */
app->doStuff(*(new SomeValue("value2")));
SomeValue tmp = SomeValue("value3");
app->doStuff(tmp);
SomeValue tmp2 = getNiceValue();
app->doStuff(tmp2);
So, three questions:
Since I am not free to change the signatures of doStuff() or getNiceValue(), does this mean I must always use some sort of "name" (even if superfluous) for anything I want to pass to doStuff?
Hypothetically, if I could change the function signatures, is there a common pattern for this sort of thing?
Does the new C++11 standard change the things at all? Is there a better way with C++11?
Thank you
An obvious question in this case is why your doStuff declares its parameter as a non-const reference. If it really attempts to modify the referred object, then changing function signature to a const reference is not an option (at least not by itself).
Anyway, "rvalue-ness" is a property of an expression that generated the temporary, not a property of temporary object itself. The temporary object itself can easily be an lvalue, yet you see it as an rvalue, since the expression that produced it was an rvalue expression.
You can work around it by introducing a "rvalue-to-lvalue converter" method into your class. Like, for example
class SomeValue {
public:
SomeValue &get_lvalue() { return *this; }
...
};
and now you can bind non-const references to temporaries as
app->doStuff(SomeValue("value1").get_lvalue());
app->doStuff(getNiceValue().get_lvalue());
Admittedly, it doesn't look very elegant, but it might be seen as a good thing, since it prevents you from doing something like that inadvertently. Of course, it is your responsibility to remember that the lifetime of the temporary extends to the end of the full expression and no further.
Alternatively, a class can overload the unary & operator (with natural semantics)
class SomeValue {
public:
SomeValue *operator &() { return this; }
...
};
which then can be used for the same purpose
app->doStuff(*&SomeValue("value1"));
app->doStuff(*&getNiceValue());
although overloading the & operator for the sole purpose of this workaround is not a good idea. It will also allow one to create pointers to temporaries.
Since I am not free to change the signatures of doStuff() or getNiceValue(), does this mean I must always use some sort of "name"
(even if superfluous) for anything I want to pass to doStuff?
Pretty much yes. This signature assumes that you want to use input as an "out" parameter. So the author of doStuff believes that if the client passes an anonymous object in, that is a logical error best caught at compile time.
Hypothetically, if I could change the function signatures, is there a common pattern for this sort of thing?
In C++11 only, you could change or overload like so:
doStuff(SomeValue&& input);
Now input will only bind to an rvalue. If you've overloaded, then the original will get the lvalues and your new overload will get the rvalues.
Does the new C++11 standard change the things at all? Is there a better way with C++11?
Yes, see the rvalue reference overload above.
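A short sketch of how such an overload pair dispatches. SomeValue here is a hypothetical stand-in for the asker's type, and doStuff returns a label only so that the chosen overload can be observed.

```cpp
#include <cassert>
#include <string>

struct SomeValue { std::string text; };  // hypothetical stand-in type

SomeValue getNiceValue() { return SomeValue{"nice"}; }

// The original signature keeps receiving named objects (lvalues)...
std::string doStuff(SomeValue&) { return "lvalue overload"; }
// ...and the added overload picks up temporaries (rvalues).
std::string doStuff(SomeValue&&) { return "rvalue overload"; }

void demo() {
    SomeValue named{"value3"};
    assert(doStuff(named) == "lvalue overload");
    assert(doStuff(SomeValue{"value1"}) == "rvalue overload");
    assert(doStuff(getNiceValue()) == "rvalue overload");
}
```

Inside the rvalue overload the parameter itself is a named object, so it can be modified or forwarded to the lvalue overload if the two should share an implementation.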
std::forward is usually the way to 'convert' value category. However it is prohibited to accept rvalues when forwarding as an lvalue, for the same reasons that a reference to non-const won't bind to an rvalue. That being said, and assuming you don't want to overload doStuff (otherwise see Hinnant's answer), you can write a utility yourself:
template<typename T>
T& unsafe_lvalue(T&& ref)
{ return ref; }
And use it like so: app->doStuff(unsafe_lvalue(getNiceValue())). No intrusive modification needed.
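For completeness, a self-contained sketch of that utility in use. SomeValue, getNiceValue, and doStuff are hypothetical reconstructions of the asker's code, with doStuff returning a value only so the effect can be checked.

```cpp
#include <cassert>

// The utility from above: hands back an lvalue reference to whatever it is given.
template <typename T>
T& unsafe_lvalue(T&& ref) { return ref; }

struct SomeValue { int uses = 0; };  // hypothetical stand-in type

SomeValue getNiceValue() { return SomeValue{}; }

// Signature that cannot be changed: takes a non-const lvalue reference.
int doStuff(SomeValue& input) { return ++input.uses; }

int call_with_temporary() {
    // The temporary lives until the end of this full expression, so the
    // reference is valid for the duration of the doStuff call and no longer.
    return doStuff(unsafe_lvalue(getNiceValue()));
}

void demo() {
    assert(call_with_temporary() == 1);
    SomeValue named;
    assert(doStuff(unsafe_lvalue(named)) == 1);  // lvalues pass through unchanged
}
```

Storing the returned reference beyond the full expression would leave it dangling, which is why the name carries the word unsafe.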
You must always use a name for values you pass to doStuff. The reasons for this are covered in detail at How come a non-const reference cannot bind to a temporary object?. The short summary is that passing a reference implies that doStuff can change the value that it references, and that changing the value of a reference to a temporary is something that the compiler should not let you do.
I'd avoid the first solution, because it allocates memory on the heap that is never freed.
The common pattern for solving this is to change doStuff's signature to take a const reference.
Unfortunately, I think the answer is that you have to have a named object to pass into doStuff. I don't think there is a C++11 feature that gives you flexibility in this area, nor have I heard of any design pattern for this type of situation.
If this is something you expect to encounter often in your program, I would write an interface more suited to the needs of the current app. If it's just a one off, I would tend to just create a temp object to store the result (as you have done).

Why would you pass an object by value in C++ [duplicate]

Possible Duplicate:
Is it better in C++ to pass by value or pass by constant reference?
I'm aware of the differences of passing by value, pointer and reference in C++, and I'd consider passing objects by value (instead of const reference) in C++ to be almost always a programming error.
void foo(Obj o); ... // Bad
void foo(const Obj &o); ... // Better
The only case I can think of where it might be appropriate to pass by value instead of const reference is where the object is smaller than a reference, and passing by value is therefore more efficient.
But, surely this is the sort of thing that compilers are built to determine?
Why does C++ actually need pass by value AND pass by const reference, and - are compilers allowed to automatically convert the call to (and from) a const reference if appropriate?
(There seem to be 100s of C++ calling convention question, asking about the differences between (say) value and reference - but I couldn't find one that asked "why?".)
The question of when passing by value might be better than by const reference has different answers with different versions of the standard.
In the good old C++03, and a few years ago, the recommendation would be to pass anything that does not fit in a register by const reference. In this case, the answer would be:
Because Obj fits in a register and passing by value will be more efficient
Still in C++03, in its last years (absurd as it seems, some articles recommended this almost 10 years back, but there was no real consensus), the answer would be:
If the function needs to make a copy, then doing so in the interface allows the compiler to perform copy elision if the source for the copy is a temporary, so it can be more efficient.
With the approval of the new C++11 standard, and increasing compiler support for rvalue references, the same holds in many cases even when the copy cannot be elided:
If the function needs to make a copy, even when the copy cannot be elided, and for types that support it, the contents will be moved (in common jargon the object will be moved, but it is only the contents that get shifted), which again will be more efficient than copying internally.
As for the question of why there are two different calling conventions: they have different goals. Passing by value allows the function to modify the state of the argument without interfering with the source object. Additionally, the state of the source object will not interfere with the function either (consider a multithreaded environment, and a thread modifying the source while the function is still executing).
Certainly one reason C++ has pass-by-value is because it inherited it from C, and removing that could break code for little gain.
Secondly, as you note, for types that are smaller than a reference, passing by reference would be less efficient.
Another less obvious case however is if you have a function that needs a copy of its argument for some reason:
void foo(const Obj& obj)
{
    if (very_rare_check()) return;
    Obj obj_copy(obj);
    obj_copy.do_work();
}
In this case note that you're forcing a copy. But suppose you call this function with the result of another function that returns by value:
Obj bar() { return Obj(parameters); }
And call it thusly: foo(bar());
Now when you use the const reference version, the compiler will end up making two objects: The temporary, and the copy in foo. If however you passed by value the compiler can optimize away all the temporaries to the location used by the by-value parameter of foo.
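The difference can be made visible with a small sketch, assuming a hypothetical Obj that counts copy constructions (the counter, the int payload, and the names foo_by_ref, foo_by_value, copies_by_ref, and copies_by_value are illustrative and not from the original):

```cpp
// Hypothetical stand-in for Obj that counts copy constructions.
struct Obj {
    static int copies;
    int value;
    explicit Obj(int v) : value(v) {}
    Obj(const Obj& other) : value(other.value) { ++copies; }
    Obj(Obj&& other) noexcept : value(other.value) {}
    void do_work() { ++value; }
};
int Obj::copies = 0;

Obj bar() { return Obj(41); }

// Const-reference version: the copy is made by hand inside the function.
int foo_by_ref(const Obj& obj) {
    Obj obj_copy(obj);
    obj_copy.do_work();
    return obj_copy.value;
}

// By-value version: the parameter itself is the working copy.
int foo_by_value(Obj obj) {
    obj.do_work();
    return obj.value;
}

// Report how many copy constructions each style costs for a temporary.
int copies_by_ref() {
    Obj::copies = 0;
    foo_by_ref(bar());
    return Obj::copies;
}

int copies_by_value() {
    Obj::copies = 0;
    foo_by_value(bar());
    return Obj::copies;
}
```

With a temporary argument, the const-reference version always pays for one copy, while the by-value version pays for none: the temporary is either constructed directly in the parameter or moved into it.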
There's a great article about this and move semantics in general at http://cpp-next.com/archive/2009/08/want-speed-pass-by-value/
Finally the canonical way to implement certain operators is to use pass-by-value to avoid copies inside the operator:
Obj operator+(Obj left, const Obj& right)
{
    return left += right;
}
Note how this lets the compiler generate the copy in the parameter rather than forcing a copy or temporary object within the operator's code itself.
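For a version that compiles on its own, here is a sketch with a hypothetical Vec2 type (the type and its members are assumptions made for illustration only):

```cpp
// Hypothetical value type used only for illustration.
struct Vec2 {
    int x;
    int y;

    Vec2& operator+=(const Vec2& rhs) {
        x += rhs.x;
        y += rhs.y;
        return *this;
    }
};

// 'left' is already a private copy of the caller's operand, so it can be
// modified and returned without touching the original.
Vec2 operator+(Vec2 left, const Vec2& right) {
    left += right;
    return left;
}
```

Splitting the body into two statements is a deliberate choice in this sketch: returning the named parameter lets a C++11 compiler treat it as an rvalue, whereas returning the result of += goes through a reference and so copies. For a trivial type like Vec2 this makes no measurable difference.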
If I wanted to do things to the object within the function without affecting the original, I would pass by value:
A minus(A b) {
    b.val = -b.val;
    return b;
}
The copy-and-swap idiom uses pass by value to get a compiler-generated copy.
MyClass& operator=(MyClass value) // pass by value to generate copy
{
    value.swap(*this); // Now do the swap part.
    return *this;
}
Basically, pass by value in situations where you need to modify the parameter but do not want to touch the original. In these situations, if you pass by const reference you need to create a copy manually inside the function. This manual step prevents certain optimizations that the compiler can perform if you let it handle the copy.
MyClass a;
// Some code
a = MyClass(); // reset the value of a
// compiler can easily elide this copy.
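A fuller sketch of copy-and-swap, assuming a hypothetical IntBuffer class that owns a heap array (the class and all of its members are illustrative, not taken from the answer):

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

// Hypothetical resource-owning class; the name IntBuffer is illustrative only.
class IntBuffer {
public:
    explicit IntBuffer(std::size_t n, int fill = 0)
        : size_(n), data_(n ? new int[n] : nullptr) {
        std::fill(data_, data_ + size_, fill);
    }
    IntBuffer(const IntBuffer& other)
        : size_(other.size_), data_(size_ ? new int[size_] : nullptr) {
        std::copy(other.data_, other.data_ + size_, data_);
    }
    ~IntBuffer() { delete[] data_; }

    // Pass by value: the compiler makes (or elides) the copy for us.
    IntBuffer& operator=(IntBuffer value) {
        value.swap(*this);  // take the new state, hand the old state to 'value'
        return *this;       // 'value' releases the old state when it goes out of scope
    }

    void swap(IntBuffer& other) noexcept {
        std::swap(size_, other.size_);
        std::swap(data_, other.data_);
    }

    std::size_t size() const { return size_; }
    int at(std::size_t i) const { return data_[i]; }

private:
    std::size_t size_;  // declared first so it is initialized before data_
    int* data_;
};
```

Because the copy happens before the function body runs, a failed allocation leaves the target untouched, and self-assignment needs no special check.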
If the object is mutable, passing by value gives the receiver its own copy to use and where sensible change, without affecting the caller's copy - always assuming it's a sufficiently deep copy.
This may simplify thinking in some multi-threaded situations.
Why does C++ actually need pass by value AND pass by const reference, and - are compilers allowed to automatically convert the call to (and from) a const reference if appropriate?
Let me answer the second one first: sometimes.
Compilers are allowed to elide the copy into the parameter, but only if you pass in an rvalue temporary. For example:
void foo(Obj o);
foo((Obj())); // Extra parentheses are not required in a call; they matter only in declarations, to avoid the Most Vexing Parse
The copying of the temporary into the parameter may be elided (i.e. not performed) at the compiler's convenience.
However, this copy will never be elided:
Obj a;
foo(a);
Now, on to the first. C++ needs both because you may want to use both for different things. Pass by value is useful for transferring ownership; this is more important in C++11 where we can move rather than copy objects.
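As a sketch of ownership transfer through a by-value parameter, assuming a hypothetical Registry class (Registry, adopt, count, and value_at are illustrative names only):

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical owner type; Registry and adopt() are illustrative names.
class Registry {
public:
    // Taking the unique_ptr by value states in the signature that the
    // callee takes over ownership of the pointee.
    void adopt(std::unique_ptr<int> item) {
        items_.push_back(std::move(item));
    }

    std::size_t count() const { return items_.size(); }
    int value_at(std::size_t i) const { return *items_[i]; }

private:
    std::vector<std::unique_ptr<int>> items_;
};
```

A caller holding a named pointer has to write registry.adopt(std::move(p)), which makes the handover visible at the call site; afterwards p is null.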

Proper named temporaries and rvalue-reference/move

Prior to C++11, and as a standard programming idiom, temporaries are often assigned to variables to make the code cleaner. For small types a copy is typically made, and for larger types perhaps a reference, such as:
int a = int_func();
T const & obj = obj_func();
some_func( a, obj );
Now, compare this to the inlined form:
some_func( int_func(), obj_func() );
Prior to C++11 these two forms had nearly identical semantics. With the introduction of rvalue references and move semantics they are now entirely different. In particular, by forcing obj to type T const & you have removed the ability to use a move constructor, whereas in the inline form the argument is an rvalue and can bind to a T&& instead.
Given that the first is a common paradigm, is there anything in the standard that would allow an optimizer to use a move constructor in the first case? That is, could somehow the compiler ignore the binding to a T const & and instead treat it as a T&&, or, as I suspect, would this violate the rules of the abstract machine?
Second part of the question, to do this correctly in C++11 (without eliminating named temporaries) we need to somehow declare a proper rvalue-reference. We can also use the auto keyword. So, what is the proper way to do this? My guess is:
auto&& obj = obj_func();
Part 1:
The compiler is not allowed to implicitly transform obj into a non-const rvalue and thus use a move constructor when calling some_func.
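A small sketch of why the const binding blocks the move, assuming a hypothetical T that records which constructor produced it (the flags and helper functions are illustrative; only obj_func and the T const & binding come from the question):

```cpp
#include <utility>

// Hypothetical type that records which constructor produced it.
struct T {
    bool was_copied;
    bool was_moved;
    T() : was_copied(false), was_moved(false) {}
    T(const T&) : was_copied(true), was_moved(false) {}
    T(T&&) noexcept : was_copied(false), was_moved(true) {}
};

T obj_func() { return T(); }

// Takes its argument by value and reports how the parameter was constructed.
bool param_was_copied(T arg) { return arg.was_copied; }

// Binds the temporary to a const reference first, as in the question.
bool copied_via_const_ref() {
    T const& obj = obj_func();
    return param_was_copied(obj);             // lvalue: copy constructor
}

// Even std::move cannot help: it yields 'const T&&', which cannot bind
// to T&& and so still selects the copy constructor.
bool copied_via_moved_const_ref() {
    T const& obj = obj_func();
    return param_was_copied(std::move(obj));
}

// Inlined form: the temporary initializes the parameter directly.
bool copied_inline() {
    return param_was_copied(obj_func());
}
```

Both named forms end up in the copy constructor; only the inlined call avoids it.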
Part 2:
auto&& obj = obj_func();
This will create a non-const reference to the temporary, but it will not be implicitly moved from when calling some_func because obj is an lvalue. To transform it to an rvalue you should use std::move at the call site:
some_func( a, std::move(obj) );
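To see which overload each spelling selects, here is a sketch that assumes a hypothetical single-argument some_func overload pair on std::string (the overloads, their return values, and the pass_named / pass_moved helpers are made up for illustration):

```cpp
#include <string>
#include <utility>

// Hypothetical overload pair that reports which one was selected.
std::string some_func(const std::string&) { return "copy"; }
std::string some_func(std::string&&) { return "move"; }

std::string obj_func() { return "temporary"; }

// A named rvalue reference is itself an lvalue, so the const& overload wins.
std::string pass_named() {
    auto&& obj = obj_func();
    return some_func(obj);
}

// std::move casts it back to an rvalue at the call site.
std::string pass_moved() {
    auto&& obj = obj_func();
    return some_func(std::move(obj));
}
```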
I doubt this could be optimized, since it's not clear whether you will use the temporary again or not:
Foo x = get_foo(); // using best-possible constructor and (N)RVO anyway
do_something(x);
do_something_else(x);
//...
If you're really keen on exploiting move semantics somewhere (but be sure to profile first to see that this really matters), you can make this explicit with move:
Foo y = get_foo();
do_oneoff_thing(std::move(y)); // y may have been moved from: do not rely on its value afterwards
I'd say that if something is eligible for moving, then you might as well do the inlining yourself and do without the extra local variable. After all, what good is such a temporary variable if it is only used once? The only scenario that comes to mind is if the last use of the local variable can exploit move semantics, so you could add std::move to the final appearance. That sounds like a maintenance hazard though, and you'd really need a compelling reason to write that.
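The "move on the last use" scenario might look like the following sketch, where get_numbers, inspect, Archive, and archive_numbers are hypothetical stand-ins for get_foo and its consumers:

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical producer and consumers standing in for get_foo() and friends.
std::vector<int> get_numbers() { return {1, 2, 3}; }

std::size_t inspect(const std::vector<int>& v) { return v.size(); }

struct Archive {
    std::vector<int> stored;
    void keep(std::vector<int> v) { stored = std::move(v); }
};

// The local is read once and then handed over on its final use.
std::size_t archive_numbers(Archive& archive) {
    std::vector<int> numbers = get_numbers();
    std::size_t n = inspect(numbers);    // ordinary use, no move
    archive.keep(std::move(numbers));    // last use: contents are moved
    return n;                            // 'numbers' is not touched again
}
```

The maintenance hazard is visible here: anyone who later adds a use of numbers after the keep call would be reading a moved-from object.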
I don't think that binding a temporary to a const & to extend its lifetime is so common. As a matter of fact, in C++03, in many cases the copy can be elided by just passing by value and calling the function in what you call the inlined form: some_func( int_func(), obj_func() ), so the same problem that you are noticing in C++11 would occur in C++03 (in a slightly different way).
As for the const reference binding, in the event that obj_func() returns an object of type T, the above code is just a cumbersome way of doing T obj = obj_func(); that offers no advantage other than making people wonder why that was needed.
If obj_func() returns a type T1 derived from T, that trick enables you to ignore the exact return type, but that can also be achieved by using the auto keyword, so in either case, what you have is a local named variable obj.
The proper way to pass obj into the function, if you are finished with it and the function can move the value from obj into an internal object, would be to actually move:
some_func( a, std::move(obj) );