Why would I ever use push_back instead of emplace_back?

Why would I ever use push_back instead of emplace_back? - c++

C++11 vectors have the new function emplace_back. Unlike push_back, which relies on compiler optimizations to avoid copies, emplace_back uses perfect forwarding to send the arguments directly to the constructor to create an object in-place. It seems to me that emplace_back does everything push_back can do, but some of the time it will do it better (but never worse).
What reason do I have to use push_back?

I have thought about this question quite a bit over the past four years. I have come to the conclusion that most explanations about push_back vs. emplace_back miss the full picture.
Last year, I gave a presentation at C++Now on Type Deduction in C++14. I start talking about push_back vs. emplace_back at 13:49, but there is useful information that provides some supporting evidence prior to that.
The real primary difference has to do with implicit vs. explicit constructors. Consider the case where we have a single argument that we want to pass to push_back or emplace_back.
std::vector<T> v;
v.push_back(x);
v.emplace_back(x);
After your optimizing compiler gets its hands on this, there is no difference between these two statements in terms of generated code. The traditional wisdom is that push_back will construct a temporary object, which will then get moved into v whereas emplace_back will forward the argument along and construct it directly in place with no copies or moves. This may be true based on the code as written in standard libraries, but it makes the mistaken assumption that the optimizing compiler's job is to generate the code you wrote. The optimizing compiler's job is actually to generate the code you would have written if you were an expert on platform-specific optimizations and did not care about maintainability, just performance.
The actual difference between these two statements is that the more powerful emplace_back will call any type of constructor out there, whereas the more cautious push_back will call only constructors that are implicit. Implicit constructors are supposed to be safe. If you can implicitly construct a U from a T, you are saying that U can hold all of the information in T with no loss. It is safe in pretty much any situation to pass a T and no one will mind if you make it a U instead. A good example of an implicit constructor is the conversion from std::uint32_t to std::uint64_t. A bad example of an implicit conversion is double to std::uint8_t.
We want to be cautious in our programming. We do not want to use powerful features because the more powerful the feature, the easier it is to accidentally do something incorrect or unexpected. If you intend to call explicit constructors, then you need the power of emplace_back. If you want to call only implicit constructors, stick with the safety of push_back.
An example
std::vector<std::unique_ptr<T>> v;
T a;
v.emplace_back(std::addressof(a)); // compiles
v.push_back(std::addressof(a)); // fails to compile
std::unique_ptr<T> has an explicit constructor from T *. Because emplace_back can call explicit constructors, passing a non-owning pointer compiles just fine. However, when v goes out of scope, the destructor will attempt to call delete on that pointer, which was not allocated by new because it is just a stack object. This leads to undefined behavior.
This is not just invented code. This was a real production bug I encountered. The code was std::vector<T *>, but it owned the contents. As part of the migration to C++11, I correctly changed T * to std::unique_ptr<T> to indicate that the vector owned its memory. However, I was basing these changes off my understanding in 2012, during which I thought "emplace_back does everything push_back can do and more, so why would I ever use push_back?", so I also changed the push_back to emplace_back.
Had I instead left the code as using the safer push_back, I would have instantly caught this long-standing bug and it would have been viewed as a success of upgrading to C++11. Instead, I masked the bug and didn't find it until months later.

push_back always allows the use of uniform initialization, which I'm very fond of. For instance:
struct aggregate {
int foo;
int bar;
};
std::vector<aggregate> v;
v.push_back({ 42, 121 });
On the other hand, v.emplace_back({ 42, 121 }); will not work.

Backwards compatibility with pre-C++11 compilers.

Some library implementations of emplace_back do not behave as specified in the C++ standard including the version that ship with Visual Studio 2012, 2013 and 2015.
In order to accommodate known compiler bugs, prefer usingstd::vector::push_back() if the parameters reference iterators or other objects which will be invalid after the call.
std::vector<int> v;
v.emplace_back(123);
v.emplace_back(v[0]); // Produces incorrect results in some compilers
On one compiler, v contains the values 123 and 21 instead of the expected 123 and 123. This is due to the fact that the 2nd call to emplace_back results in a resize at which point v[0] becomes invalid.
A working implementation of the above code would use push_back() instead of emplace_back() as follows:
std::vector<int> v;
v.emplace_back(123);
v.push_back(v[0]);
Note: The use of a vector of ints is for demonstration purposes. I discovered this issue with a much more complex class which included dynamically allocated member variables and the call to emplace_back() resulted in a hard crash.

Use push_back only for primitive/built-in types or raw pointers. Otherwise use emplace_back.

Consider what happens in Visual Studio 2019 with c++-17 compiler. We have emplace_back in a function with proper arguments set up. Then someone changes parameters of the constuctor called by emplace_back. There is no warning whatsover in VS, the code also compiles fine, then it crashes in runtime. I removed all emplace_back from the codebase after this.

Related

Why use std::make_unique in C++17?

As far as I understand, C++14 introduced std::make_unique because, as a result of the parameter evaluation order not being specified, this was unsafe:
f(std::unique_ptr<MyClass>(new MyClass(param)), g()); // Syntax A
(Explanation: if the evaluation first allocates the memory for the raw pointer, then calls g() and an exception is thrown before the std::unique_ptr construction, then the memory is leaked.)
Calling std::make_unique was a way to constrain the call order, thus making things safe:
f(std::make_unique<MyClass>(param), g()); // Syntax B
Since then, C++17 has clarified the evaluation order, making Syntax A safe too, so here's my question: is there still a reason to use std::make_unique over std::unique_ptr's constructor in C++17? Can you give some examples?
As of now, the only reason I can imagine is that it allows to type MyClass only once (assuming you don't need to rely on polymorphism with std::unique_ptr<Base>(new Derived(param))). However, that seems like a pretty weak reason, especially when std::make_unique doesn't allow to specify a deleter while std::unique_ptr's constructor does.
And just to be clear, I'm not advocating in favor of removing std::make_unique from the Standard Library (keeping it makes sense at least for backward compatibility), but rather wondering if there are still situations in which it is strongly preferred to std::unique_ptr

You're right that the main reason was removed. There are still the don't use new guidelines and that it is less typing reasons (don't have to repeat the type or use the word new). Admittedly those aren't strong arguments but I really like not seeing new in my code.
Also don't forget about consistency. You absolutely should be using make_shared so using make_unique is natural and fits the pattern. It's then trivial to change std::make_unique<MyClass>(param) to std::make_shared<MyClass>(param) (or the reverse) where the syntax A requires much more of a rewrite.

make_unique distinguishes T from T[] and T[N], unique_ptr(new ...) does not.
You can easily get undefined behaviour by passing a pointer that was new[]ed to a unique_ptr<T>, or by passing a pointer that was newed to a unique_ptr<T[]>.

The reason is to have shorter code without duplicates. Compare
f(std::unique_ptr<MyClass>(new MyClass(param)), g());
f(std::make_unique<MyClass>(param), g());
You save MyClass, new and braces. It costs only one character more in make in comparison with ptr.

Every use of new has to be extra carefully audited for lifetime correctness; does it get deleted? Only once?
Every use of make_unique doesn't for those extra characteristics; so long as the owning object has "correct" lifetime, it recursively makes the unique pointer have "correct".
Now, it is true that unique_ptr<Foo>(new Foo()) is identical in all ways1 to make_unique<Foo>(); it just requires a simpler "grep your source code for all uses of new to audit them".
1 actually a lie in the general case. Perfect forwarding isn't perfect, {}, default init, arrays are all exceptions.

Since then, C++17 has clarified the evaluation order, making Syntax A safe too
That's really not good enough. Relying on a recently-introduced technical clause as the guarantee of safety is not a very robust practice:
Someone might compile this code in C++14.
You would be encouraging the use of raw new's elsewhere, e.g. by copy-pasting your example.
As S.M. suggests, since there's code duplication, one type might get changed without the other one being changed.
Some kind of automatic IDE refactoring might move that new elsewhere (ok, granted, not much chance of that).
Generally, it's a good idea for your code to be appropriate/robust/clearly valid without resorting to language-laywering, looking up minor or obscure technical clauses in the standard.
(this is essentially the same argument I made here about the order of tuple destruction.)

Consider
void function(std::unique_ptr(new A()), std::unique_ptr(new B())) { ... }
Suppose that new A() succeeds, but new B() throws an exception: you catch it to resume the normal execution of your program. Unfortunately, the C++ standard does not require that object A gets destroyed and its memory deallocated: memory silently leaks and there's no way to clean it up. By wrapping A and B into std::make_uniques you are sure the leak will not occur:
void function(std::make_unique(), std::make_unique()) { ... }
The point here is that std::make_unique and std::make_unique are now temporary objects, and cleanup of temporary objects is correctly specified in the C++ standard: their destructors will be triggered and the memory freed. So if you can, always prefer to allocate objects using std::make_unique and std::make_shared.

Moving (not copying) part of a vector to another vector using C++98

In C++11 we can do the following:
std::vector<int> A(n);
std::vector<int> B;
// I want to move part of A (from A[start] to A[end]) to B
std::move(A.begin()+start,A.begin()+end,std::back_inserter(B));
// in C++98 there is no move operator, so use copy (assign)
B.assign(A.begin()+start, A.begin()+end);
Is there any way in C++98 to do something similar to C++11 implementation.
The goal is to save time and use move instead of copying the vector, do you have any idea?
In the real code, I am not using int, It is vector of object.

Unfortunately, move semantics are not available in C++98, (Since they were introduced in C++11), but for the example given, moving the integers is not much different from copying them, so the actual performance difference should be negligible, if there is any.

While move semantics didn't exist in C++98, you can always write something that does the equivalent yourself for this situation.
A quick and easy way to do it would be to just introduce a "moveTo" function on your class that you plan to work with, and then just call that with the reference/pointer of where you want to move it to.
It's worth highlighting that I have said "on your class"; because as Arnav Borborah said, your example is too trivial to make a difference; but I highlight this option in case it's a poor example.

Filling containers in functions avoiding segmentation faults

I come from Java, where filling containers did not involve thinking. My problem now that i work with c++ is, that filling a container in a function with data which is declared in function scope can lead to errors, where the data doesn't exist any more when I want to access it.
I could not find tutorials addressing the problem, so I went the Java way and only made containers getting pointers declared with "new". But now I am forced to return a
std::list<Vertex<float> >
from a function and thought this might be a good point to learn how I would fill and return such a thing. Would this
{
std::list<Vertex<float> > myList;
Vertex<float> v(0.0, 0.1, 0.2);
myList.push_back(v);
myList.push_back(Vertex<float>(1,0, 1.1, 1.2));
return myList;
}
actually be fine as a sample function body? And if yes, why would v still exist outside of the scope? Does each insertion in a container also imply a copying?

This would work "fine" since every operation there creates a copy.
myList.push_back(v); creates a copy of v so that visiblity of v is now irrelevant.
return myList; returns a copy of the list to the calling function so the visiblity of myList is now irrelevant. The calling function should make a copy of this list to keep it in scope, else it will be destroyed at the end of execution of the line that calls this function.
The reason that fine is quoted is that copies are typically expensive. In your case they are quite small so it might be irrelevant, and in many cases they might be optimised away, but it is still something to keep in mind.
Old C++ way of optimizing is to pass a list by reference and use that to construct your list, instead of returning by value.
void MakeMeAList(std::list<Vertex<float> >& aList){
....
}
std::list<Vertex<float> > aList;
MakeMeAList(aList);
As #billz suggests, Return Value Optimization should optimize away the copies even if this is not done.
New C++ (c++11) -
The use of emplace_back to construct the list, would be more efficient that copying, as long as the input variables are not going to be used any more. ( thanks #Troy)
My C++11 is weak, I am almost sure even returning by value is OK since Move semantics would optimize it away, but I am only 95% sure.

To add to Karthik's reply, if you're using a relatively new compiler that implements r-value references (part of C++0x standard), your approach will work perfectly without the negative impact of the copy operation.
Check out http://en.wikipedia.org/wiki/C%2B%2B11#Rvalue_references_and_move_constructors
for a brief intro to r-values.
Note that before r-value references were adopted, many compilers eliminated the 'expensive copy operation' involved in returning collections through what is called Return Value Optimization. Again Wiki has more details:
http://en.wikipedia.org/wiki/Return_value_optimization
Visual Studio 2005 (and newer) implements RVO, and I believe GCC also has a similar feature. So effectively, your code, which is more readable than using arguments to return values, is the correct approach and ought to work very well if you're using the latest compilers.

Which C++ idioms are deprecated in C++11?

With the new standard, there are new ways of doing things, and many are nicer than the old ways, but the old way is still fine. It's also clear that the new standard doesn't officially deprecate very much, for backward compatibility reasons. So the question that remains is:
What old ways of coding are definitely inferior to C++11 styles, and what can we now do instead?
In answering this, you may skip the obvious things like "use auto variables".

Final Class: C++11 provides the final specifier to prevent class derivation
C++11 lambdas substantially reduce the need for named function object (functor) classes.
Move Constructor: The magical ways in which std::auto_ptr works are no longer needed due to first-class support for rvalue references.
Safe bool: This was mentioned earlier. Explicit operators of C++11 obviate this very common C++03 idiom.
Shrink-to-fit: Many C++11 STL containers provide a shrink_to_fit() member function, which should eliminate the need swapping with a temporary.
Temporary Base Class: Some old C++ libraries use this rather complex idiom. With move semantics it's no longer needed.
Type Safe Enum Enumerations are very safe in C++11.
Prohibiting heap allocation: The = delete syntax is a much more direct way of saying that a particular functionality is explicitly denied. This is applicable to preventing heap allocation (i.e., =delete for member operator new), preventing copies, assignment, etc.
Templated typedef: Alias templates in C++11 reduce the need for simple templated typedefs. However, complex type generators still need meta functions.
Some numerical compile-time computations, such as Fibonacci can be easily replaced using generalized constant expressions
result_of: Uses of class template result_of should be replaced with decltype. I think result_of uses decltype when it is available.
In-class member initializers save typing for default initialization of non-static members with default values.
In new C++11 code NULL should be redefined as nullptr, but see STL's talk to learn why they decided against it.
Expression template fanatics are delighted to have the trailing return type function syntax in C++11. No more 30-line long return types!
I think I'll stop there!

At one point in time it was argued that one should return by const value instead of just by value:
const A foo();
^^^^^
This was mostly harmless in C++98/03, and may have even caught a few bugs that looked like:
foo() = a;
But returning by const is contraindicated in C++11 because it inhibits move semantics:
A a = foo(); // foo will copy into a instead of move into it
So just relax and code:
A foo(); // return by non-const value

As soon as you can abandon 0 and NULL in favor of nullptr, do so!
In non-generic code the use of 0 or NULL is not such a big deal. But as soon as you start passing around null pointer constants in generic code the situation quickly changes. When you pass 0 to a template<class T> func(T) T gets deduced as an int and not as a null pointer constant. And it can not be converted back to a null pointer constant after that. This cascades into a quagmire of problems that simply do not exist if the universe used only nullptr.
C++11 does not deprecate 0 and NULL as null pointer constants. But you should code as if it did.

Safe bool idiom → explicit operator bool().
Private copy constructors (boost::noncopyable) → X(const X&) = delete
Simulating final class with private destructor and virtual inheritance → class X final

One of the things that just make you avoid writing basic algorithms in C++11 is the availability of lambdas in combination with the algorithms provided by the standard library.
I'm using those now and it's incredible how often you just tell what you want to do by using count_if(), for_each() or other algorithms instead of having to write the damn loops again.
Once you're using a C++11 compiler with a complete C++11 standard library, you have no good excuse anymore to not use standard algorithms to build your's. Lambda just kill it.
Why?
In practice (after having used this way of writing algorithms myself) it feels far easier to read something that is built with straightforward words meaning what is done than with some loops that you have to uncrypt to know the meaning. That said, making lambda arguments automatically deduced would help a lot making the syntax more easily comparable to a raw loop.
Basically, reading algorithms made with standard algorithms are far easier as words hiding the implementation details of the loops.
I'm guessing only higher level algorithms have to be thought about now that we have lower level algorithms to build on.

You'll need to implement custom versions of swap less often. In C++03, an efficient non-throwing swap is often necessary to avoid costly and throwing copies, and since std::swap uses two copies, swap often has to be customized. In C++, std::swap uses move, and so the focus shifts on implementing efficient and non-throwing move constructors and move assignment operators. Since for these the default is often just fine, this will be much less work than in C++03.
Generally it's hard to predict which idioms will be used since they are created through experience. We can expect an "Effective C++11" maybe next year, and a "C++11 Coding Standards" only in three years because the necessary experience isn't there yet.

I do not know the name for it, but C++03 code often used the following construct as a replacement for missing move assignment:
std::map<Big, Bigger> createBigMap(); // returns by value
void example ()
{
std::map<Big, Bigger> map;
// ... some code using map
createBigMap().swap(map); // cheap swap
}
This avoided any copying due to copy elision combined with the swap above.

When I noticed that a compiler using the C++11 standard no longer faults the following code:
std::vector<std::vector<int>> a;
for supposedly containing operator>>, I began to dance. In the earlier versions one would have to do
std::vector<std::vector<int> > a;
To make matters worse, if you ever had to debug this, you know how horrendous are the error messages that come out of this.
I, however, do not know if this was "obvious" to you.

Return by value is no longer a problem. With move semantics and/or return value optimization (compiler dependent) coding functions are more natural with no overhead or cost (most of the time).

std::auto_ptr to std::unique_ptr

With the new standard coming (and parts already available in some compilers), the new type std::unique_ptr is supposed to be a replacement for std::auto_ptr.
Does their usage exactly overlap (so I can do a global find/replace on my code (not that I would do this, but if I did)) or should I be aware of some differences that are not apparent from reading the documentation?
Also if it is a direct replacement, why give it a new name rather than just improve the std::auto_ptr?

You cannot do a global find/replace because you can copy an auto_ptr (with known consequences), but a unique_ptr can only be moved. Anything that looks like
std::auto_ptr<int> p(new int);
std::auto_ptr<int> p2 = p;
will have to become at least like this
std::unique_ptr<int> p(new int);
std::unique_ptr<int> p2 = std::move(p);
As for other differences, unique_ptr can handle arrays correctly (it will call delete[], while auto_ptr will attempt to call delete.

std::auto_ptr and std::unique_ptr are incompatible in someways and a drop in replacement in others. So, no find/replace isn't good enough. However, after a find/replace working through the compile errors should fix everything except weird corner cases. Most of the compile errors will require adding a std::move.
Function scope variable:
100% compatible, as long as you don't pass it by value to another function.
Return type:
not 100% compatible but 99% compatible doesn't seem wrong.
Function parameter by value:
100% compatible with one caveat, unique_ptrs must be passed through a std::move call. This one is simple as the compiler will complain if you don't get it right.
Function parameter by reference:
100% compatible.
Class member variable:
This one is tricky. std::auto_ptrs copy semantics are evil. If the class disallows copying then std::unique_ptr is a drop in replacement. However, if you tried to give the class reasonable copy semantics, you'll need to change the std::auto_ptr handling code. This is simple as the compiler will complain if you don't get it right. If you allowed copying of a class with a std::auto_ptr member without any special code, then shame on you and good luck.
In summary, std::unique_ptr is an unbroken std::auto_ptr. It disallows at compile time behaviors that were often errors when using a std::auto_ptr. So if you used std::auto_ptr with the care it needed, switching to std::unique_ptr should be simple. If you relied on std::auto_ptr's odd behavior, then you need to refactor your code anyway.

AFAIK, unique_ptr is not a direct replacement. The major flaw that it fixes is the implicit transfer of ownership.
std::auto_ptr<int> a(new int(10)), b;
b = a; //implicitly transfers ownership
std::unique_ptr<int> a(new int(10)), b;
b = std::move(a); //ownership must be transferred explicitly
On the other hand, unique_ptr will have completely new capabilities: they can be stored in containers.

Herb Sutter has a nice explanation on GotW #89:
What’s the deal with auto_ptr? auto_ptr is most charitably characterized as a valiant attempt to create a unique_ptr before C++
had move semantics. auto_ptr is now deprecated, and should not be used
in new code.
If you have auto_ptr in an existing code base, when you get a chance
try doing a global search-and-replace of auto_ptr to unique_ptr; the
vast majority of uses will work the same, and it might expose (as a
compile-time error) or fix (silently) a bug or two you didn't know you
had.
In other words, while a global search-and-replace may "break" your code temporarily, you should do it anyway: It may take some time to fix the compile errors, but will save you a lot more trouble in the long run.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js