How to choose between `push_*()` and `emplace_*()` functions?

I understand the difference between the two function variants.
My question is: should I normally use good old push_*() version and only switch to emplace_*() when my profiler tells me this will benefit performance (that is, do not optimise prematurely)? Or should I switch to using emplace_*() as the default (perhaps not to pessimise the code unnecessarily - similar to i++ vs ++i in for loops)?
Is any of the variants more universal than the other (that is, imposes less constraints on the type being inserted) in realistic non-contrived use cases?

While writing the code I would not worry about performance. Performance is for later when you already have code that you can profile.
I'd rather worry about expressiveness of the code. Roughly speaking, push_back is for when you have an element and want to place a copy inside the container. emplace_back is to construct the element in place.
Consider what has the lower "wtf-count":
#include <vector>
struct foo {
    int x;
    int y;
    foo(int x_, int y_) : x(x_), y(y_) {} // emplace_back(x, y) needs a constructor before C++20
};
void foo_add(const foo& f, std::vector<foo>& v) {
    v.emplace_back(f); // wtf ?!? we already have a foo
    v.push_back(f); // ... simply make a copy (or move)
}
void foo_add(int x, int y, std::vector<foo>& v) {
    auto z = foo{x,y}; // wtf ?!?
    v.push_back(z); // why create a named intermediate object?
    v.emplace_back(x,y); // ... simply construct it in place
}

emplace functions forward their arguments to a constructor of the element type.
Let's say you have a container of T.
If you already have a T (maybe it's const, maybe it's an rvalue, maybe neither), then you use push_xxx().
Your object will be copied/moved into the container.
If you instead want to construct a T, then you use emplace_xxx(), with the same parameters you would send the constructor.
An object will be constructed directly in the container.

Emplace functions are more generic than push functions. They are never less efficient; on the contrary, they can be more efficient, because they let you avoid one copy/move of the container element when you construct it from arguments. When putting an element into the container involves a copy/move anyway, emplace and push operations are equivalent.
Push can be preferable if you actually want to enforce construction before copying/moving the element into the container, for example if your element type has some special logic in its constructor that you want to execute before the container is modified. Such cases are quite rare, though.

If you switch from push_back to emplace_back in a naive way, you will have no advantage at all. Consider the following code:
#include <iostream>
#include <string>
#include <vector>
struct President
{
std::string name;
std::string country;
int year;
President(std::string p_name, std::string p_country, int p_year) :
name(std::move(p_name)), country(std::move(p_country)), year(p_year)
{
std::cout << "I am being constructed.\n";
}
President(President&& other) :
name(std::move(other.name)), country(std::move(other.country)),
year(other.year)
{
std::cout << "I am being moved.\n";
}
President& operator=(const President& other) = default;
};
int main()
{
std::vector<President> elections;
std::cout << "emplace_back:\n";
elections.emplace_back("Nelson Mandela", "South Africa", 1994);
std::vector<President> reElections;
std::cout << "\npush_back:\n";
reElections.push_back(
President("Franklin Delano Roosevelt", "the USA", 1936));
std::cout << "\nContents:\n";
for (President const& president : elections)
{
std::cout << president.name << " was elected president of "
<< president.country << " in " << president.year << ".\n";
}
for (President const& president : reElections)
{
std::cout << president.name << " was re-elected president of "
<< president.country << " in " << president.year << ".\n";
}
}
If you replace push_back with emplace_back here, you still get a construction followed by a move. You only save effort if you pass the arguments needed for construction instead of the constructed instance itself (see the call to emplace_back).

Related

Why does this code allow push_back of a unique_ptr to a vector?

I thought adding a unique_ptr to a vector shouldn't work.
Why does it work for the code below?
Is it caused by not marking the copy ctor as "deleted"?
#include <iostream>
#include <vector>
#include <memory>
class Test
{
public:
int i = 5;
};
int main()
{
std::vector<std::unique_ptr<Test>> tests;
tests.push_back(std::make_unique<Test>());
for (auto &test : tests)
{
std::cout << test->i << std::endl;
}
for (auto &test : tests)
{
std::cout << test->i << std::endl;
}
}

There is no copy here, only moves.
In this context, make_unique produces an unnamed unique_ptr instance (a temporary), so push_back receives it as an rvalue reference and is free to move from it.
It produces pretty much the same result as this code would:
std::vector<std::unique_ptr<Test>> tests;
auto ptr = std::make_unique<Test>();
tests.push_back(std::move(ptr));
This is called move semantics, if you want to search for more info on the matter (it is only available from C++11 onward).

There are two overloads of std::vector::push_back according to https://en.cppreference.com/w/cpp/container/vector/push_back
In your case the overload taking an rvalue reference is selected, so no copy is required.

Does std::optional forward rvalueness when member functions of the contained object are called?

A little-known feature of C++ is ref-qualifiers for member functions.
It works as I expect it to work in most cases, but it seems that std::optional does not forward the knowledge of its imminent demise to contained object member functions.
For example consider the following code:
#include <iostream>
#include <optional>
#include <string>
struct Noisy {
Noisy(const std::string& data): data_(data){
}
~Noisy(){
std::cout << "Goodbye" << std::endl;
}
std::string data_;
const std::string& data() const & {
std::cout << "returning data by ref" << std::endl;
return data_;
}
std::string data() && {
std::cout << "returning data by move" << std::endl;
return std::move(data_);
}
};
int main() {
for (const auto chr: Noisy{"Heeeeeeeeeeeeeeeeello wooooorld"}.data()){
std::cout << chr;
}
std::cout << std::endl;
for (const auto chr: std::optional<Noisy>{"Heeeeeeeeeeeeeeeeello wooooorld"}->data()){
std::cout << chr;
}
std::cout << std::endl;
}
output is:
returning data by move
Goodbye
Heeeeeeeeeeeeeeeeello wooooorld
returning data by ref
Goodbye
(crash in clang with sanitizer or garbage(UB))
I was hoping that the temporary std::optional would be kind enough to call the correct (data() &&) function, but it seems that does not happen.
Is this a language limitation, or does std::optional just not have the correct machinery for this?
note: my motivation is hacking around to see if I can be clever to enable safer usage of my classes in range based for loop, but realistically it is not worth the effort, this question is mostly about learning about language.

Overloaded operator arrow cannot do what you want; it terminates with a pointer always.
x->y is defined by the standard as (*x).y if and only if x is a pointer; otherwise it is (x.operator->())->y. This recursion only terminates when you hit a pointer. [1]
And there is no pointer-to-temporary type. Try this:
for (const auto chr : (*std::optional<Noisy>{"Heeeeeeeeeeeeeeeeello wooooorld"}).data())
This does call the rvalue method (via @largest_prime).
[1] This recursion can also do Turing-complete computation.

When to use an std::unique_ptr as a container?

I'm working on a game in Cocos2d-x. I have CCLayers* and lots of CCSprites* that are created. I add these CCSprites to a std::vector after I create them.
My concern is memory and deleting.
I am trying to wrap my head around std::unique_ptr. My understanding is that smart pointers will help clean up memory and prevent leaks.
But I don't understand how to use it. Do I make a unique_ptr out of every CCSprite*? Do I make a unique_ptr and put my whole vector in it?
Can anyone help me understand and give me an idea of what to brush up on?

Wherever you currently use new, make sure the result immediately goes to the constructor of a unique_ptr, or to its reset() function. Place that smart pointer so that it lives as long as the object is needed. You may also hand the controlled object over to a different unique_ptr instance, or destroy it early using reset().
You don't usually allocate vectors with new, so they don't need a smart pointer: the vector itself manages the memory for its contents, so you're already ahead there.

Simplistically unique_ptr<T> is a wrapper class for a member T* p. In unique_ptr::~unique_ptr it calls delete p. It has a deleted copy constructor so that you don't accidentally copy it (and hence cause a double deletion).
It has a few more features, but that is basically all it is.
If you are writing a performance-critical game, it is probably a better idea to manage memory manually with some sort of memory-pool architecture. That isn't to say that you can't use a vector<unique_ptr<T>> as part of that, just to say that you should plan out the lifetime of your dynamic objects first, and then decide what mechanism to use to delete them at the end of that lifetime.

Cocos2d-x objects have their own reference counter, and they use an autorelease pool. If you use std::unique_ptr, you have to manually remove the created object from the autorelease pool and then register it in the unique_ptr. Prefer CCPointer: https://github.com/ivzave/cocos2dx-ext/blob/master/CCPointer.h

If you need a polymorphic container, that is, a vector that can hold CCSprites or any derived class, then you can use a std::vector<std::unique_ptr<CCSprite>> to describe this and to get lifetime management of the objects.
#include <memory>
#include <vector>
#include <iostream>
using namespace std;
class Foo {
protected:
int m_i; // protected so that FooBar can read it
public:
Foo(int i_) : m_i(i_) { cout << "Foo " << m_i << " ctor" << endl; }
// virtual, because derived objects are deleted through unique_ptr<Foo>
virtual ~Foo() { cout << "Foo " << m_i << " ~tor" << endl; }
};
class FooBar : public Foo {
public:
FooBar(int i_) : Foo(i_) { cout << "FooBar " << m_i << " ctor" << endl; }
~FooBar() { cout << "FooBar " << m_i << " ~tor" << endl; }
};
int main(int argc, const char** argv) {
vector<unique_ptr<Foo>> foos;
Foo foo(1);
foos.emplace_back(unique_ptr<Foo>(new Foo(2)));
cout << "foos size at end: " << foos.size() << endl;
return 0;
}
(I tried adding an example of a short scoped unique_ptr being added to the vector but it caused my GCC 4.7.3 to crash when testing)
Foo 1 ctor
Foo 2 ctor
foos size at end: 1
[<-- exit happens here]
Foo 1 ~tor
Foo 2 ~tor
If you don't need a polymorphic container, then you can avoid the memory management overhead by just having the vector directly contain the CCSprite objects. The disadvantage to this approach is that the address of given sprites can change if you add/remove elements. If the object is non-trivial this can quickly get very expensive:
std::vector<CCSprite> sprites;
sprites.emplace_back(/* args */);
CCSprite* const first = &sprites.front();
for (size_t i = 0; i < 128; ++i) {
sprites.emplace_back(/* args */);
}
assert(first == &sprites.front()); // probably fires.

In an STL Map of structs, why does the "[ ]" operator cause the struct's dtor to be invoked 2 extra times?

I've created a simple test case exhibiting a strange behavior I've noticed in a larger code base I'm working on. This test case is below. I'm relying on the STL Map's "[ ]" operator to create a pointer to a struct in a map of such structs. In the test case below, the line...
TestStruct *thisTestStruct = &testStructMap["test"];
...gets me the pointer (and creates a new entry in the map). The weird thing I've noticed is that this line not only causes a new entry in the map to be created (because of the "[ ]" operator), but for some reason it causes the struct's destructor to be called two extra times. I'm obviously missing something - any help is much appreciated!
Thanks!
#include <iostream>
#include <string>
#include <map>
using namespace std;
struct TestStruct;
int main (int argc, char * const argv[]) {
map<string, TestStruct> testStructMap;
std::cout << "Marker One\n";
//why does this line cause "~TestStruct()" to be invoked twice?
TestStruct *thisTestStruct = &testStructMap["test"];
std::cout << "Marker Two\n";
return 0;
}
struct TestStruct{
TestStruct(){
std::cout << "TestStruct Constructor!\n";
}
~TestStruct(){
std::cout << "TestStruct Destructor!\n";
}
};
the code above outputs the following...
/*
Marker One
TestStruct Constructor! //makes sense
TestStruct Destructor! //<---why?
TestStruct Destructor! //<---god why?
Marker Two
TestStruct Destructor! //makes sense
*/
...but I don't understand what causes the first two invocations of TestStruct's destructor?
(I think the last destructor invocation makes sense because testStructMap is going out of scope.)

The functionality of std::map<>::operator[] is equivalent to
(*((std::map<>::insert(std::make_pair(x, T()))).first)).second
expression, as specified in the C++03 standard (since C++17, operator[] is specified in terms of try_emplace, which constructs the element in place). This, as you can see, involves default-constructing a temporary object of type T, copying it into a std::pair object, which is later copied (again) into the new element of the map (assuming it wasn't there already). Obviously, this will produce a few intermediate T objects. Destruction of these intermediate objects is what you observe in your experiment. You miss their construction, since you don't generate any feedback from the copy constructor of your class.
The exact number of intermediate objects might depend on compiler optimization capabilities, so the results may vary.

You have some unseen copies being made:
#include <iostream>
#include <string>
#include <map>
using namespace std;
struct TestStruct;
int main (int argc, char * const argv[]) {
map<string, TestStruct> testStructMap;
std::cout << "Marker One\n";
//why does this line cause "~TestStruct()" to be invoked twice?
TestStruct *thisTestStruct = &testStructMap["test"];
std::cout << "Marker Two\n";
return 0;
}
struct TestStruct{
TestStruct(){
std::cout << "TestStruct Constructor!\n";
}
TestStruct( TestStruct const& other) {
std::cout << "TestStruct copy Constructor!\n";
}
TestStruct& operator=( TestStruct const& rhs) {
std::cout << "TestStruct copy assignment!\n";
return *this; // was missing: flowing off the end of a non-void function is undefined behavior
}
~TestStruct(){
std::cout << "TestStruct Destructor!\n";
}
};
Results in:
Marker One
TestStruct Constructor!
TestStruct copy Constructor!
TestStruct copy Constructor!
TestStruct Destructor!
TestStruct Destructor!
Marker Two
TestStruct Destructor!

Add the following to TestStruct's interface:
TestStruct(const TestStruct& other) {
std::cout << "TestStruct Copy Constructor!\n";
}

Your two mysterious destructor calls are probably paired with copy constructor calls going on somewhere within the std::map. For example, it's conceivable that operator[] default-constructs a temporary TestStruct object, and then copy-constructs it into the proper location in the map. The reason that there are two destructor calls (and thus probably two copy constructor calls) is implementation-specific, and will depend on your compiler and standard library implementation.

operator[] inserts to the map if there is not already an element there.
What you are missing is output for the compiler-supplied copy constructor in your TestStruct, which is used during container housekeeping. Add that output, and it should all make more sense.
EDIT: Andrey's answer prompted me to take a look at the source in Microsoft VC++ 10's <map>, which is something you could also do to follow this through in all its gory detail. You can see the insert() call to which he refers.
mapped_type& operator[](const key_type& _Keyval)
{ // find element matching _Keyval or insert with default mapped
iterator _Where = this->lower_bound(_Keyval);
if (_Where == this->end()
|| this->comp(_Keyval, this->_Key(_Where._Mynode())))
_Where = this->insert(_Where,
value_type(_Keyval, mapped_type()));
return ((*_Where).second);
}

So the lesson is: don't put structs in a map if you care about their lifecycles. Use pointers or, even better, shared_ptrs to them.

You can check it with this simpler code.
#include <iostream>
#include <map>
using namespace std;
class AA
{
public:
AA() { cout << "default const" << endl; }
AA(int a):x(a) { cout << "user const" << endl; }
AA(const AA& a) { cout << "default copy const" << endl; }
~AA() { cout << "dest" << endl; }
private:
int x;
};
int main ()
{
AA o1(1);
std::map<char,AA> mymap;
mymap['x']=o1; // (1)
return 0;
}
The result below shows that line (1) above makes one default-constructor call and two copy-constructor calls.
user const
default const // here
default copy const // here
default copy const // here
dest
dest
dest
dest

Efficient push_back of classes and structs

I've seen my colleague do the second snippet quite often. Why is this? I've tried adding print statements to track the ctors and dtors, but both seem identical.
// first snippet
std::vector<ClassTest> vecClass1;
ClassTest ct1;
ct1.blah = blah; // set some stuff
...
vecClass1.push_back(ct1);

// second snippet
std::vector<ClassTest> vecClass2;
vecClass2.push_back(ClassTest());
ClassTest& ct2 = vecClass2.back();
ct2.blah = blah; // set some stuff
...
PS. I'm sorry if the title is misleading.
Edit:
Firstly, thank you all for your responses.
I've written a small application using std::move. The results are surprising to me, perhaps because I've done something wrong. Would someone please explain why the "fast" path performs significantly better?
#include <vector>
#include <string>
#include <boost/progress.hpp>
#include <iostream>
const std::size_t SIZE = 10*100*100*100;
//const std::size_t SIZE = 1;
const bool log = (SIZE == 1);
struct SomeType {
std::string who;
std::string bio;
SomeType() {
if (log) std::cout << "SomeType()" << std::endl;
}
SomeType(const SomeType& other) {
if (log) std::cout << "SomeType(const SomeType&)" << std::endl;
//this->who.swap(other.who);
//this->bio.swap(other.bio);
this->who = other.who;
this->bio = other.bio;
}
SomeType& operator=(SomeType& other) {
if (log) std::cout << "SomeType::operator=()" << std::endl;
this->who.swap(other.who);
this->bio.swap(other.bio);
return *this;
}
~SomeType() {
if (log) std::cout << "~SomeType()" << std::endl;
}
void swap(SomeType& other) {
if (log) std::cout << "Swapping" << std::endl;
this->who.swap(other.who);
this->bio.swap(other.bio);
}
// move semantics
SomeType(SomeType&& other) :
who(std::move(other.who))
, bio(std::move(other.bio)) {
if (log) std::cout << "SomeType(SomeType&&)" << std::endl;
}
SomeType& operator=(SomeType&& other) {
if (log) std::cout << "SomeType::operator=(SomeType&&)" << std::endl;
this->who = std::move(other.who);
this->bio = std::move(other.bio);
return *this;
}
};
int main(int argc, char** argv) {
{
boost::progress_timer time_taken;
std::vector<SomeType> store;
std::cout << "Timing \"slow\" path" << std::endl;
for (std::size_t i = 0; i < SIZE; ++i) {
SomeType some;
some.who = "bruce banner the hulk";
some.bio = "you do not want to see me angry";
//store.push_back(SomeType());
//store.back().swap(some);
store.push_back(std::move(some));
}
}
{
boost::progress_timer time_taken;
std::vector<SomeType> store;
std::cout << "Timing \"fast\" path" << std::endl;
for (std::size_t i = 0; i < SIZE; ++i) {
store.push_back(SomeType());
SomeType& some = store.back();
some.who = "bruce banner the hulk";
some.bio = "you do not want to see me angry";
}
}
return 0;
}
Output:
dev@ubuntu-10:~/Desktop/perf_test$ g++ -Wall -O3 push_back-test.cpp -std=c++0x
dev@ubuntu-10:~/Desktop/perf_test$ ./a.out
Timing "slow" path
3.36 s
Timing "fast" path
3.08 s

If the object is more expensive to copy after "set some stuff" than before, then the copy that happens when you insert the object into the vector will be less expensive if you insert the object before you "set some stuff" than after.
Really, though, since you should expect objects in a vector to be copied occasionally, this is probably not much of an optimization.

If we accept that your colleague's snippet is wise, because ClassTest is expensive to copy, I would prefer:
using std::swap;
std::vector<ClassTest> vecClass1;
ClassTest ct1;
ct1.blah = blah // set some stuff
...
vecClass1.push_back(ClassTest());
swap(ct1, vecClass1.back());
I think it's clearer, and it may well be more exception-safe. The ... code presumably allocates resources and hence could throw an exception (or else what's making the fully-built ClassTest so expensive to copy?). So unless the vector really is local to the function, I don't think it's a good idea for it to be half-built while running that code.
Of course this is even more expensive if ClassTest only has the default swap implementation, but if ClassTest doesn't have an efficient swap, then it has no business being expensive to copy. So this trick perhaps should only be used with classes known to be friendly, rather than unknown template parameter types.
As Gene says, std::move is better anyway, if you have that C++0x feature.
If we're worried about ClassTest being expensive to copy, though, then reallocating the vector is a terrifying prospect. So we should also either:
reserve enough space before adding anything, or
use a deque instead of a vector.

The second version benefits from moving the temporary. The first version copies the named object ct1 into the vector, so the second one is potentially faster. The second version also has potentially smaller peak memory requirements: the first version holds two fully populated objects (ct1 and its copy in the vector) until ct1 is destroyed. You can improve the first version by explicitly moving ct1:
std::vector<ClassTest> vecClass1;
ClassTest ct1;
ct1.blah = blah // set some stuff
...
vecClass1.push_back(std::move(ct1));

You should probably ask your colleague to know exactly why, but we can still take a guess. As James pointed out, it might be a tad more efficient if the object is more expensive to copy once constructed.
I see advantages in both versions.
I like your colleague's snippet because, although there are 2 objects in both cases, they only co-exist for a very short period of time in the second version. There is only one object available for editing, which avoids the potential error of editing ct1 after push_back.
I like your personal snippet because invoking push_back to add a second object potentially invalidates the reference ct2, inducing a risk of undefined behavior. The first snippet does not present this risk.

They are identical (as far as I can see). Maybe he or she does that as an idiomatic custom.
