Can the use of C++11's 'auto' improve performance?

I can see why the auto type specifier in C++11 improves correctness and maintainability. I've read that it can also improve performance (Almost Always Auto by Herb Sutter), but I haven't found a good explanation.
How can auto improve performance?
Can anyone give an example?

auto can aid performance by avoiding silent implicit conversions. An example I find compelling is the following.
std::map<Key, Val> m;
// ...
for (std::pair<Key, Val> const& item : m) {
// do stuff
}
See the bug? Here we are, thinking we're elegantly taking every item in the map by const reference and using the new range-for expression to make our intent clear, but actually we're copying every element. This is because std::map<Key, Val>::value_type is std::pair<const Key, Val>, not std::pair<Key, Val>. Thus, when we (implicitly) have:
std::pair<Key, Val> const& item = *iter;
Instead of taking a reference to an existing object and leaving it at that, we have to do a type conversion. You are allowed to take a const reference to an object (or temporary) of a different type as long as there is an implicit conversion available, e.g.:
int const& i = 2.0; // perfectly OK
The type conversion is an allowed implicit conversion for the same reason you can convert a const Key to a Key, but we have to construct a temporary of the new type in order to allow for that. Thus, effectively our loop does:
std::pair<Key, Val> __tmp = *iter; // construct a temporary of the correct type
std::pair<Key, Val> const& item = __tmp; // then, take a reference to it
(Of course, there isn't actually a __tmp object, it's just there for illustration, in reality the unnamed temporary is just bound to item for its lifetime).
Just changing to:
for (auto const& item : m) {
// do stuff
}
just saved us a ton of copies - now the referenced type matches the initializer type, so no temporary or conversion is necessary, we can just do a direct reference.
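One way to observe the difference is to count copy constructions. The following is a minimal sketch: CountedVal and the two helper functions are hypothetical names introduced only for this illustration, and int stands in for Key.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical value type that counts how often it is copy-constructed.
struct CountedVal {
    static int copies;
    CountedVal() = default;
    CountedVal(const CountedVal&) { ++copies; }
};
int CountedVal::copies = 0;

// Spelled-out element type: std::pair<int, CountedVal> is not the map's
// value_type, so every iteration converts into a temporary pair.
int copies_with_explicit_type(const std::map<int, CountedVal>& m) {
    CountedVal::copies = 0;
    for (std::pair<int, CountedVal> const& item : m) {
        (void)item;
    }
    return CountedVal::copies;
}

// Deduced element type: binds directly to the stored pair, no conversion.
int copies_with_auto(const std::map<int, CountedVal>& m) {
    CountedVal::copies = 0;
    for (auto const& item : m) {
        (void)item;
    }
    return CountedVal::copies;
}
```

For a map with three elements, the first function should report three copies and the second none.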

Because auto deduces the type of the initializing expression, there is no type conversion involved. Combined with templated algorithms, this means that you can get a more direct computation than if you were to make up a type yourself – especially when you are dealing with expressions whose type you cannot name!
A typical example comes from (ab)using std::function:
using namespace std::placeholders; // brings _1 and _2 into scope
std::function<bool(T, T)> cmp1 = std::bind(f, _2, 10, _1); // bad
auto cmp2 = std::bind(f, _2, 10, _1); // good
auto cmp3 = [](T a, T b){ return f(b, 10, a); }; // also good
std::stable_sort(begin(x), end(x), cmpN); // cmpN stands for any one of cmp1, cmp2, cmp3
With cmp2 and cmp3, the entire algorithm can inline the comparison call, whereas if you construct a std::function object, not only can the call not be inlined, but you also have to go through the polymorphic lookup in the type-erased interior of the function wrapper.
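As a rough, compilable illustration of the two ways of storing a comparator, here is a sketch with a plain int comparison. The function names are made up for this example, and the code only shows that both forms are interchangeable at the call site; whether the deduced version is actually inlined depends on the compiler and optimization level.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Type-erased comparator: the algorithm only sees std::function<bool(int, int)>,
// so each comparison goes through the wrapper's indirection.
std::vector<int> sort_descending_erased(std::vector<int> v) {
    std::function<bool(int, int)> cmp = [](int a, int b) { return a > b; };
    std::stable_sort(v.begin(), v.end(), cmp);
    return v;
}

// Deduced comparator: the algorithm is instantiated with the closure type
// itself, which gives the optimizer a chance to inline the call.
std::vector<int> sort_descending_deduced(std::vector<int> v) {
    auto cmp = [](int a, int b) { return a > b; };
    std::stable_sort(v.begin(), v.end(), cmp);
    return v;
}
```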
Another variant on this theme is that you can say:
auto && f = MakeAThing();
This is always a reference, bound to the value of the function call expression, and never constructs any additional objects. If you didn't know the returned value's type, you might be forced to construct a new object (perhaps as a temporary) via something like T && f = MakeAThing(). (Moreover, auto && even works when the return type is not movable and the return value is a prvalue.)
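The parenthetical about non-movable types can be sketched as follows. Pinned is a hypothetical type invented for this example; MakeAThing returns a braced list so that the result is initialized in place and no copy or move constructor is required.

```cpp
#include <cassert>

// Hypothetical type that can be neither copied nor moved.
struct Pinned {
    int value;
    Pinned(int v) : value(v) {}
    Pinned(const Pinned&) = delete;
    Pinned(Pinned&&) = delete;
};

// The braced return initializes the returned object directly.
Pinned MakeAThing() { return {42}; }

int read_through_forwarding_reference() {
    // Binds to the returned temporary; its lifetime is extended to f's scope.
    auto&& f = MakeAThing();
    return f.value;
}
```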

There are two categories.
First, auto can avoid type erasure. There are unnamable types (like lambdas), and almost unnamable types (like the result of std::bind or other expression-template-like things).
Without auto, you end up having to type erase the data down to something like std::function. Type erasure has costs.
std::function<void()> task1 = []{std::cout << "hello";};
auto task2 = []{std::cout << " world\n";};
task1 has type erasure overhead -- a possible heap allocation, difficulty inlining it, and virtual function table invocation overhead. task2 has none. Lambdas need auto or other forms of type deduction to be stored without type erasure; other types can be named in principle, but are so complex that they need it in practice.
Second, you can get types wrong. In some cases, the wrong type will work seemingly perfectly, but will cause a copy.
Foo const& f = expression();
will compile if expression() returns Bar const& or Bar or even Bar&, where Foo can be constructed from Bar. A temporary Foo will be created, then bound to f, and its lifetime will be extended until f goes away.
The programmer may have meant Bar const& f and not intended to make a copy there, but a copy is made regardless.
The most common example is the type of *std::map<A,B>::const_iterator, which is std::pair<A const, B> const& not std::pair<A,B> const&, but this is just one instance of a whole category of errors that silently cost performance. You can construct a std::pair<A, B> from a std::pair<const A, B>. (The key in a map is const because editing it is a bad idea.)
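The Foo/Bar case can be made observable with a counter. This is only a sketch: the conversions counter and the two helper functions are assumptions added for illustration, and Foo, Bar and expression() are filled in with the simplest definitions that fit the description.

```cpp
#include <cassert>

struct Bar { int v; };

struct Foo {
    static int conversions;
    int v;
    Foo(const Bar& b) : v(b.v) { ++conversions; }  // implicit converting constructor
};
int Foo::conversions = 0;

const Bar& expression() {
    static const Bar bar{7};
    return bar;
}

int conversions_with_wrong_type() {
    Foo::conversions = 0;
    Foo const& f = expression();   // compiles, but binds to a temporary Foo
    (void)f;
    return Foo::conversions;
}

int conversions_with_auto() {
    Foo::conversions = 0;
    auto const& f = expression();  // deduces Bar const& and binds directly
    (void)f;
    return Foo::conversions;
}
```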
Both @Barry and @KerrekSB first illustrated these two principles in their answers. This is simply an attempt to highlight the two issues in one answer, with wording that aims at the problem rather than being example-centric.

The existing three answers give examples where using auto "makes it less likely to unintentionally pessimize", effectively making it "improve performance".
There is a flip side to the coin. Using auto with objects whose operators don't return the basic object can result in incorrect (yet still compilable and runnable) code. For example, a question about the Eigen library asks why using auto gave different (incorrect) results, i.e. the following lines
const auto resAuto = Ha + Vector3(0.,0.,j * 2.567);
const Vector3 resVector3 = Ha + Vector3(0.,0.,j * 2.567);
std::cout << "resAuto = " << resAuto <<std::endl;
std::cout << "resVector3 = " << resVector3 <<std::endl;
resulted in different output. Admittedly, this is mostly due to Eigen's lazy evaluation, but that mechanism is/should be transparent to the (library) user.
While performance hasn't been greatly affected here, using auto to avoid unintentional pessimization might be classified as premature optimization, or at least wrong ;).
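To show why auto and an explicit type can differ here, the following is a deliberately tiny expression-template sketch. It is not Eigen's implementation: Vec3 and LazySum are hypothetical types, and LazySum stores copies of its operands so that the sketch itself contains no dangling references (real expression templates often store references, which is where the trouble starts).

```cpp
#include <cassert>
#include <type_traits>

struct Vec3 { double x, y, z; };

// Hypothetical lazy proxy: the sum is only computed on conversion to Vec3.
struct LazySum {
    Vec3 lhs, rhs;
    operator Vec3() const {
        return Vec3{lhs.x + rhs.x, lhs.y + rhs.y, lhs.z + rhs.z};
    }
};

inline LazySum operator+(const Vec3& a, const Vec3& b) { return LazySum{a, b}; }

double z_of_evaluated_sum() {
    const Vec3 a{1.0, 2.0, 3.0};
    const auto lazy = a + Vec3{0.0, 0.0, 1.0};   // deduces const LazySum, not Vec3
    static_assert(!std::is_same<decltype(lazy), const Vec3>::value,
                  "auto keeps the proxy type");
    const Vec3 eager = a + Vec3{0.0, 0.0, 1.0};  // conversion forces evaluation
    (void)lazy;
    return eager.z;
}
```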

Related

Perfectly transform a 2-tuple into a pair

I want a utility to transform a std::tuple<T,U> into a std::pair<T,U>, while leaving a std::tuple<T, U, V, W...> unchanged.
Furthermore, I want this utility to
be a function object, not a function, so that I can pass it around as I like;
enforce that its input must be a std::tuple;
take advantage of move semantics whenever it is possible to do so and it makes sense.
The last point is the difficult part, for me, and it's the reason I'm asking this question.
I know that tuples can store values as well as references (lvalue references as well as rvalue references), so I imagine that the sought function, if fed an rvalue 2-tuple, could steal resources from the non-reference components but should leave the reference components intact.
For instance given
A a;
B b;
std::tuple<A, B&> t{a,b};
where I know a is copied in the tuple, while b is not, I think that doing
auto p{to_pair(t)};
should result in a std::pair<A, B&> which holds another copy of a and a reference to the only b that exists so far.
On the other hand, doing
B b;
auto p{to_pair(std::tuple<A, B&>{A{},b})};
should result again in a std::pair<A, B&>, but this would hold the very A{} of which no copy should ever be made; and it would hold a reference to b anyway.
The scenario described above is already enough for me to be in doubt about what to do, let alone thinking of other combinations of &/const&/&&.
Some time ago I had a hard time answering a question about std::tuples and perfect forwarding, and I still can't say I've understood them fully.

template<class...Fs>
struct overloaded : Fs... {
using Fs::operator()...;
};
template<class...Fs>
overloaded(Fs&&...)->overloaded<std::decay_t<Fs>...>;
auto move_tuples = []<class...Ts>(std::tuple<Ts...> x){return std::move(x);};
auto to_pair = []<class A, class B>(std::tuple<A, B> x){
return std::pair<A,B>( std::get<0>(std::move(x)), std::get<1>(std::move(x)) );
};
auto your_function = overloaded{move_tuples, to_pair};
I think that is what you want.
I take the arguments by value, then move them to the return value. If they are values, the result is that the contained values are moved. If they are references, they aren't, the references are just propagated.
The get<N> "does the right thing" when passed an rvalue tuple. It returns an rvalue when the element is a value.
Templated lambdas are a C++20 feature; in C++17 you'll have to replace the lambdas with manual structs.
Or write two normal overloads, and
auto obj=[](auto&&x)->decltype(normal_func(decltype(x)(x))){return decltype(x)(x);};
to turn the overloaded function call normal_func into a single function object.
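For reference, a struct-based version that avoids templated lambdas could look roughly like this. It is a sketch under the same by-value design as above; the names to_pair_fn and to_pair are chosen for this example, and combinations involving rvalue-reference elements are not exercised.

```cpp
#include <cassert>
#include <string>
#include <tuple>
#include <type_traits>
#include <utility>

struct to_pair_fn {
    // More specialized overload: exactly two elements become a pair.
    template <class A, class B>
    std::pair<A, B> operator()(std::tuple<A, B> x) const {
        return std::pair<A, B>(std::get<0>(std::move(x)),
                               std::get<1>(std::move(x)));
    }
    // Any other tuple is handed back unchanged.
    template <class... Ts>
    std::tuple<Ts...> operator()(std::tuple<Ts...> x) const {
        return x;
    }
};

// A function object, so it can be passed around like any other value.
const to_pair_fn to_pair{};
```

Taking the tuple by value means an lvalue argument is copied and an rvalue argument is moved into x, after which value elements are moved out of x and reference elements are simply propagated.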

structured bindings with std::minmax and rvalues

I ran into a rather subtle bug when using std::minmax with structured bindings. It appears that passed rvalues will not always be copied as one might expect. Originally I was using a T operator[]() const on a custom container, but it seems to be the same with a literal integer.
#include <algorithm>
#include <cstdio>
#include <tuple>
int main()
{
auto [amin, amax] = std::minmax(3, 6);
printf("%d,%d\n", amin, amax); // undefined,undefined
int bmin, bmax;
std::tie(bmin, bmax) = std::minmax(3, 6);
printf("%d,%d\n", bmin, bmax); // 3,6
}
Using GCC 8.1.1 with -O1 -Wuninitialized will result in 0,0 being printed as first line and:
warning: ‘<anonymous>’ is used uninitialized in this function [-Wuninitialized]
Clang 6.0.1 at -O2 will also give a wrong first result with no warning.
At -O0 GCC gives a correct result and no warning. For clang the result appears to be correct at -O1 or -O0.
Should not the first and second line be equivalent in the sense that the rvalue is still valid for being copied?
Also, why does this depend on the optimization level? Particularly I was surprised that GCC issues no warning.

What's important to note in auto [amin, amax] is that the auto, auto& and so forth are applied on the made up object e that is initialized with the return value of std::minmax, which is a pair. It's essentially this:
auto e = std::minmax(3, 6);
auto&& amin = std::get<0>(e);
auto&& amax = std::get<1>(e);
The actual types of amin and amax are references that refer to whatever std::get<0> and std::get<1> return for that pair object. And they themselves return references to objects long gone!
When you use std::tie, you are doing assignment to existing objects (passed by reference). The rvalues don't need to live longer than the assignment expressions in which they come into being.
As a workaround, you can use something like this (not production quality) function:
template<typename T1, typename T2>
auto as_value(std::pair<T1, T2> in) {
using U1 = std::decay_t<T1>;
using U2 = std::decay_t<T2>;
return std::pair<U1, U2>(in);
}
It ensures the pair holds value types. When used like this:
auto [amin, amax] = as_value(std::minmax(3, 6));
We now get a copy made, and the structured bindings refer to those copies.
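The effect on the deduced types can be checked at compile time. The sketch below repeats as_value so that it is self-contained; spread_of_3_and_6 is a hypothetical helper added only to have something to run. It assumes a C++17 compiler.

```cpp
#include <algorithm>
#include <cassert>
#include <type_traits>
#include <utility>

template <typename T1, typename T2>
auto as_value(std::pair<T1, T2> in) {
    using U1 = std::decay_t<T1>;
    using U2 = std::decay_t<T2>;
    return std::pair<U1, U2>(in);
}

// std::minmax on two ints hands back a pair of references...
static_assert(std::is_same<decltype(std::minmax(3, 6)),
                           std::pair<const int&, const int&>>::value,
              "pair of references");
// ...while as_value turns it into a pair of values.
static_assert(std::is_same<decltype(as_value(std::minmax(3, 6))),
                           std::pair<int, int>>::value,
              "pair of values");

int spread_of_3_and_6() {
    // The temporaries 3 and 6 are still alive while as_value copies them.
    auto [amin, amax] = as_value(std::minmax(3, 6));
    return amax - amin;
}
```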

There are two fundamental issues going on here:
(1) min, max, and minmax for historic reasons return references. So if you pass in a temporary, you'd better take the result by value or immediately use it, otherwise you get a dangling reference. If minmax gave you a pair<int, int> here instead of a pair<int const&, int const&>, you wouldn't have any problems.
(2) auto decays top-level cv-qualifiers and strips references, but it doesn't remove them all the way down. Here, you're deducing that pair<int const&, int const&>, but if we had deduced pair<int, int>, we would again not have any problems.
(1) is a much easier problem to solve than (2): write your own functions to take everything by value:
template <typename T>
std::pair<T, T> minmax(T a, T b) {
return (b < a) ? std::pair(b, a) : std::pair(a, b);
}
auto [amin, amax] = minmax(3, 6); // no problems
The nice thing about taking everything by value is that you never have to worry about hidden dangling references, because there aren't any. And the vast majority of uses of these functions are using integral types anyway, so there's no benefit to references.
And when you do need references, for when you're comparing expensive-to-copy objects... well, it's easier to take a function that takes values and force it to use references than it is to take a function that uses references and try to fix it:
auto [lo, hi] = minmax(std::ref(big1), std::ref(big2));
Additionally, it's very visible here at the call site that we're using references, so it would be much more obvious if we messed up.
While the above works for lots of types due to reference_wrapper<T>'s implicit conversion to T&, it won't work for those types that have non-member, non-friend, operator templates (like std::string). So you'd additionally need to write a specialization for reference wrappers, unfortunately.
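One possible shape for that extra piece is an overload for std::reference_wrapper that compares the referred-to objects. This is a sketch, not a vetted design; the function is named minmax_by_value here (a made-up name) so that it cannot collide with std::minmax through argument-dependent lookup.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// By-value version, as in the answer above.
template <typename T>
std::pair<T, T> minmax_by_value(T a, T b) {
    return (b < a) ? std::pair<T, T>(b, a) : std::pair<T, T>(a, b);
}

// Overload for reference wrappers: compare the underlying objects via get(),
// so types whose operator< is a template (such as std::string) still work.
template <typename T>
std::pair<std::reference_wrapper<T>, std::reference_wrapper<T>>
minmax_by_value(std::reference_wrapper<T> a, std::reference_wrapper<T> b) {
    using R = std::pair<std::reference_wrapper<T>, std::reference_wrapper<T>>;
    return (b.get() < a.get()) ? R(b, a) : R(a, b);
}
```

The result pair is spelled out explicitly because std::make_pair would unwrap the reference wrappers into plain references.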

C++ auto& vs auto

When creating local variables, is it correct to use (const) auto& or auto?
e.g.:
SomeClass object;
const auto result = object.SomeMethod();
or const auto& result = object.SomeMethod();
Where SomeMethod() returns a non-primitive value - maybe another user-defined type.
My understanding is that const auto& result is correct since the result returned by SomeMethod() would call the copy constructor for the returned type. Please correct me if I am wrong.
What about for primitive types? I assume const auto sum = 1 + 2; is correct.
Does this also apply to range based for loops?
for(const auto& object : objects)

auto and auto && cover most of the cases:
Use auto when you need a local copy. This will never produce a reference. The copy (or move) constructor must exist, but it might not get called, due to the copy elision optimization.
Use auto && when you don't care if the object is local or not. Technically, this will always produce a reference, but if the initializer is a temporary (e.g., the function returns by value), it will behave essentially like your own local object.
Also, auto && doesn't guarantee that the object will be modifiable, either. Given a const object or reference, it will deduce const. However, modifiability is often assumed, given the specific context.
auto & and auto const & are a little more specific:
auto & guarantees that you are sharing the variable with something else. It is always a reference and never to a temporary.
auto const & is like auto &&, but provides read-only access.
What about for primitive/non-primitive types?
There is no difference.
Does this also apply to range based for loops?
Yes. Applying the above principles,
Use auto && for the ability to modify and discard values of the sequence within the loop. (That is, unless the container provides a read-only view, such as std::initializer_list, in which case it will be effectively an auto const &.)
Use auto & to modify the values of the sequence in a meaningful way.
Use auto const & for read-only access.
Use auto to work with (modifiable) copies.
You also mention auto const with no reference. This works, but it's not very commonly used because there is seldom an advantage to read-only access to something that you already own.
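A small sketch of the loop forms listed above, using a std::vector<int>; the function names are invented for this example.

```cpp
#include <cassert>
#include <vector>

// auto&: the loop variable aliases each element, so the change sticks.
std::vector<int> doubled_in_place(std::vector<int> values) {
    for (auto& v : values) {
        v *= 2;
    }
    return values;
}

// auto: the loop variable is a copy, so the container is left untouched.
std::vector<int> doubled_copies_only(std::vector<int> values) {
    for (auto v : values) {
        v *= 2;
        (void)v;
    }
    return values;
}

// auto const&: read-only access without copying the elements.
int sum_read_only(const std::vector<int>& values) {
    int total = 0;
    for (auto const& v : values) {
        total += v;
    }
    return total;
}
```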

Yes, it is correct to use auto and auto& for local variables.
When getting the return type of a function, it is also correct to use auto&. This applies for range based for loops as well.
General rules for using auto are:
Choose auto x when you want to work with copies.
Choose auto &x when you want to work with original items and may modify them.
Choose auto const &x when you want to work with original items and will not modify them.
You can read more about the auto specifier here.

auto uses the same mechanism of type deduction as templates, the only exception that I am aware of being that of brace-init lists, which are deduced by auto as std::initializer_list, but non-deduced in a template context.
auto x = expression;
works by first stripping the reference and any top-level cv-qualifiers from the type of the right hand side expression, then matching the type. For example, if you have const int& f(){...} then auto x = f(); deduces x as int, and not const int&.
The other form,
auto& x = expression
does not strip the cv-qualifiers, so, using the example above, auto& x = f() deduces x as const int&. The other combinations just add cv qualifiers.
If you want your type to be always deduced with cv-ref qualifiers, use the infamous decltype(auto) in C++14, which uses the decltype type deduction rules.
So, in a nutshell, if you want copies, use auto, if you want references, use auto&. Use const whenever you want additional const-ness.
EDIT
There is an additional use case,
auto&& x = expression;
which uses the reference-collapsing rules, same as in the case of forwarding references in template code. If expression is an lvalue, then x is an lvalue reference with the cv-qualifiers of expression. If expression is an rvalue, then x is an rvalue reference.
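These deduction rules can be checked with static_assert. In the sketch below, f matches the const int& f() example used earlier, while g and deduction_rules_hold are hypothetical helpers added for illustration; decltype(auto) requires C++14.

```cpp
#include <cassert>
#include <type_traits>

inline const int& f() {
    static const int value = 0;
    return value;
}
inline int g() { return 0; }  // returns a prvalue

inline bool deduction_rules_hold() {
    auto a = f();            // reference and top-level const are dropped
    static_assert(std::is_same<decltype(a), int>::value, "auto");

    auto& b = f();           // const is kept behind the reference
    static_assert(std::is_same<decltype(b), const int&>::value, "auto&");

    decltype(auto) c = f();  // exactly the declared return type
    static_assert(std::is_same<decltype(c), const int&>::value, "decltype(auto)");

    auto&& d = f();          // lvalue initializer: lvalue reference
    static_assert(std::is_same<decltype(d), const int&>::value, "auto&& from lvalue");

    auto&& e = g();          // rvalue initializer: rvalue reference
    static_assert(std::is_same<decltype(e), int&&>::value, "auto&& from rvalue");

    (void)a; (void)b; (void)c; (void)d; (void)e;
    return true;
}
```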

When creating local variables, is it correct to use (const) auto& or auto?
Yes. The auto is nothing more than a compiler-deduced type, so use references where you would normally use references, and local (automatic) copies where you would normally use local copies. Whether or not to use a reference is independent of type deduction.
Where SomeMethod() returns a non-primitive value - maybe another user-defined type. My understanding is that const auto& result is correct since the result returned by SomeMethod() would call the copy constructor for the returned type. Please correct me if I am wrong.
Legal? Yes, with the const. Best practice? Probably not, no. At least, not with C++11. Especially not, if the value returned from SomeMethod() is already a temporary. You'll want to learn about C++11 move semantics, copy elision, and return value optimization:
https://juanchopanzacpp.wordpress.com/2014/05/11/want-speed-dont-always-pass-by-value/
http://www.informit.com/guides/content.aspx?g=cplusplus&seqNum=199
https://isocpp.org/wiki/faq/ctors#return-by-value-optimization
What about for primitive types? I assume const auto sum = 1 + 2; is correct.
Yes, this is fine.
Does this also apply to range based for loops?
for(const auto& object : objects)
Yes, this is also fine. I write this sort of code at work all the time.

Passing an element to a lambda by reference-to-const

Inside an algorithm, I want to create a lambda that accepts an element by reference-to-const:
template<typename Iterator>
void solve_world_hunger(Iterator it)
{
auto lambda = [](const decltype(*it)& x){
auto y = x; // this should work
x = x; // this should fail
};
}
The compiler does not like this code:
Error: »const«-qualifier cannot be applied to »int&« (translated manually from German)
Then I realized that decltype(*it) is already a reference, and of course those cannot be made const. If I remove the const, the code compiles, but I want x = x to fail.
Let us trust the programmer (which is me) for a minute and get rid of the const and the explicit &, which gets dropped due to reference collapsing rules, anyways. But wait, is decltype(*it) actually guaranteed to be a reference, or should I add the explicit & to be on the safe side?
If we do not trust the programmer, I can think of two ways to solve the problem:
(const typename std::remove_reference<decltype(*it)>::type& x)
(const typename std::iterator_traits<Iterator>::value_type& x)
You can decide for yourself which one is uglier. Ideally, I would want a solution that does not involve any template meta-programming, because my target audience has never heard of that before. So:
Question 1: Is decltype(*it)& always the same as decltype(*it)?
Question 2: How can I pass an element by reference-to-const without template meta-programming?

Question 1: no, the requirement on InputIterator is merely that *it is convertible to T (table 72, in "Iterator requirements").
So decltype(*it) could for example be const char& for an iterator whose value_type is int. Or it could be int. Or double.
Using iterator_traits is not equivalent to using decltype; decide which one you want.
For the same reason, auto value = *it; does not necessarily give you a variable with the value type of the iterator.
Question 2: might depend what you mean by template meta-programming.
If using a traits type is TMP, then there's no way of specifying "const reference to the value type of an iterator" without TMP, because iterator_traits is the only means to access the value type of an arbitrary iterator.
If you want to const-ify the decltype then how about this?
template<typename Iterator>
void solve_world_hunger(Iterator it)
{
const auto ret_type = *it;
auto lambda = [](decltype(ret_type)& x){
auto y = x; // this should work
x = x; // this should fail
};
}
You might have to capture ret_type in order to use its type, I can't easily check at the moment.
Unfortunately it dereferences the iterator an extra time. You could probably write some clever code to avoid that, but the clever code would end up being an alternative version of remove_reference, hence TMP.
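If a small amount of TMP is acceptable, it can at least be hidden behind an alias so that the lambda itself stays readable. The following is a sketch of the remove_reference variant; const_ref_of and twice_first_element are hypothetical names, and the int return type is an assumption made only to keep the example short.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

// Reference-to-const of whatever *it yields, with any reference stripped first.
template <typename Iterator>
using const_ref_of =
    const typename std::remove_reference<decltype(*std::declval<Iterator&>())>::type&;

static_assert(std::is_same<const_ref_of<std::vector<int>::iterator>,
                           const int&>::value, "mutable iterator");
static_assert(std::is_same<const_ref_of<std::vector<int>::const_iterator>,
                           const int&>::value, "const iterator");

template <typename Iterator>
int twice_first_element(Iterator it) {
    auto lambda = [](const_ref_of<Iterator> x) {
        // x = x;  // would not compile: x refers to a const object
        return x + x;
    };
    return lambda(*it);
}
```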
