STL efficiency when using big objects

To be more specific let's restrict the scope of the question to libstdc++ and Visual C++.
Consider the case when objects stored in a container have the following properties:
Copying and assignment can be expensive
Moving and swapping are cheap and never throw
Default constructor is cheap and never throws
Some of the containers may and will reallocate/move stored objects when elements are added or removed. In that case, do the above-mentioned STL implementations avoid copying when reallocating/moving elements?
What about std::sort and other algorithms?
If you think about it, there is no need for copying when moving and swapping are available.
As you may know, all STL operations provide Big O complexity guarantees. Big O means there is a constant multiplied by some function of N. My question could be paraphrased as: what does that constant include? Does it include the cost of copying, or is it proportional to the cost of moving/swapping?
Thank you.

The only general answer that can be given is that C++ is made by smart people that care a lot about performance so you usually won't find easy optimizations missed out and shouldn't worry too much about your standard library's performance.
You can answer these questions type-by-type and function-by-function by reading the specification in the standard, websites like cppreference.com, or the documentation that comes with your implementation. For example, if std::vector::push_back has to re-allocate its internal buffer, it will use the move constructor to “copy” over the elements if that constructor is declared noexcept, or if the type cannot be copied at all; otherwise it falls back to the copy constructor (also see std::move_if_noexcept).
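To make the noexcept rule concrete, here is a minimal sketch that counts copy and move constructions during a single reallocation. Probe and force_one_reallocation are hypothetical names invented for this illustration; the counts shown in the comments are what I would expect from implementations that follow the std::move_if_noexcept rule, not a guarantee for every library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical probe type: counts copy and move constructions. The template
// parameter only controls whether the move constructor is declared noexcept.
template <bool NoexceptMove>
struct Probe {
    static int copies;
    static int moves;
    Probe() = default;
    Probe(const Probe&) { ++copies; }
    Probe(Probe&&) noexcept(NoexceptMove) { ++moves; }
};
template <bool N> int Probe<N>::copies = 0;
template <bool N> int Probe<N>::moves = 0;

// Fills a vector up to its capacity, resets the counters, then appends one
// more element so that exactly one reallocation is observed.
// Returns the number of elements that had to be relocated.
template <typename T>
std::size_t force_one_reallocation() {
    std::vector<T> v;
    v.emplace_back();
    while (v.size() < v.capacity()) v.emplace_back();
    T::copies = 0;
    T::moves = 0;
    const std::size_t old_size = v.size();
    v.emplace_back();  // exceeds capacity: existing elements are relocated
    return old_size;
}
```

Calling force_one_reallocation<Probe<true>>() should leave Probe<true>::copies at 0 and Probe<true>::moves equal to the returned count, while Probe<false> should show the opposite: only copies, because the vector cannot risk a throwing move.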
A different approach to reason about what is actually going on inside your standard library is taking it for a test drive. Instrument a simple struct to print out logging messages from its constructors and assignment operators, then put instances of that class into a standard library container and exercise some algorithm on it. The following example uses std::vector and std::sort. You can play with it by using different containers and algorithms. Also see what's happening if you make the changes indicated by the comments.
#include <algorithm>
#include <iomanip>
#include <iostream>
#include <random>
#include <vector>
struct Example
{
int id;
Example(const int id) : id {id}
{
std::cout << __PRETTY_FUNCTION__ << std::endl;
}
Example(const Example& rhs) : id {rhs.id}
{
std::cout << __PRETTY_FUNCTION__ << std::endl;
}
// try commenting out the 'noexcept'
Example(Example&& rhs) noexcept : id {rhs.id}
{
std::cout << __PRETTY_FUNCTION__ << std::endl;
}
Example&
operator=(const Example& rhs)
{
this->id = rhs.id;
std::cout << __PRETTY_FUNCTION__ << std::endl;
return *this;
}
// try commenting out the 'noexcept'
Example&
operator=(Example&& rhs) noexcept
{
this->id = rhs.id;
std::cout << __PRETTY_FUNCTION__ << std::endl;
return *this;
}
~Example() noexcept
{
std::cout << __PRETTY_FUNCTION__ << std::endl;
}
};
int
main()
{
const auto n = 10;
auto rndeng = std::default_random_engine {};
auto rnddst = std::uniform_int_distribution<int> {};
auto elements = std::vector<Example> {};
std::cout << "CONSTRUCTING VECTOR OF " << n << " ELEMENTS...\n\n";
elements.reserve(n); // try commenting this out
for (auto i = 0; i < n; ++i)
elements.emplace_back(rnddst(rndeng)); // try using push_back instead
const auto cmp = [](const Example& lhs, const Example& rhs){
return lhs.id < rhs.id;
};
std::cout << "\nSORTING ELEMENTS...\n\n";
std::sort(elements.begin(), elements.end(), cmp);
std::cout << "\nSORTED ELEMENTS:\n\n";
for (const auto& elem : elements)
std::cout << std::setw(16) << elem.id << "\n";
std::cout << "\nLEAVING MAIN...\n\n";
}
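Regarding std::sort specifically: one way to convince yourself that it does not need copies is to hand it a type whose copy operations are deleted. The sketch below uses a hypothetical BigObject type and a helper sorted_keys, both invented for this illustration. If the algorithm copied elements anywhere, this would not compile.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a "big" object: copying is disabled outright,
// moving is cheap and noexcept.
struct BigObject {
    int key;
    explicit BigObject(int k) : key(k) {}
    BigObject(const BigObject&) = delete;
    BigObject& operator=(const BigObject&) = delete;
    BigObject(BigObject&&) noexcept = default;
    BigObject& operator=(BigObject&&) noexcept = default;
};

// Builds move-only objects from the given keys, sorts them, and returns the
// resulting key order.
std::vector<int> sorted_keys(const std::vector<int>& keys) {
    std::vector<BigObject> objects;
    objects.reserve(keys.size());
    for (int k : keys) objects.emplace_back(k);
    std::sort(objects.begin(), objects.end(),
              [](const BigObject& a, const BigObject& b) { return a.key < b.key; });
    std::vector<int> result;
    for (const BigObject& o : objects) result.push_back(o.key);
    return result;
}
```

Since C++11, std::sort only requires its elements to be move-constructible, move-assignable and swappable, so the constant factor in its complexity is paid in moves and swaps rather than copies.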

Related

C++ Map Initialize object using non-default constructor

In C++ suppose I have an unordered map defined as follows:
unordered_map<int, MyClass> my_map;
auto my_class = my_map[1];
In the above code if 1 is not present as key in my_map it will initialize MyClass with default constructor and return. But is there a way to use non-default constructor of MyClass for initialization?

You're right that operator[] needs the value type to be default-constructible.
insert does not:
std::unordered_map<int, MyClass> my_map;
// Populate the map here
// Get element with key "1", creating a new one
// from the given value if it doesn't already exist
auto result = my_map.insert({1, <your value here>});
This gives you a pair containing an iterator to the element (whether created new, or already present), and a boolean (telling you which was the case).
So:
auto& my_class = result.first->second;
const bool was_inserted = result.second;
Now you can do whatever you like with this information. Often you won't even care about result.second and can just ignore it.
For more complex value types you can play around with emplace, which is like insert but, um, better. Say you really don't want the value to be constructed if it won't be used, and you have C++17:
auto result = my_map.try_emplace(1, <your value's ctor args here>);
If you don't care (or don't have C++17):
auto result = my_map.emplace(1, <your value>);
This can still be better than insert, as it constructs the element in place from its arguments instead of going through a temporary pair.
Ultimately, and if you don't even want to unnecessarily produce your ctor args, you can always just do a find first, but it's nice to try to avoid that, as the insertion operation itself will be doing a find too.
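As a rough, compilable illustration of the try_emplace variant (C++17 assumed), the sketch below uses a hypothetical Widget type without a default constructor and a hypothetical helper get_or_create. The static counter exists only to show that the second call constructs nothing.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical mapped type with no default constructor; it counts how many
// times its two-argument constructor runs.
struct Widget {
    static int constructed;
    std::string name;
    int size;
    Widget(std::string n, int s) : name(std::move(n)), size(s) { ++constructed; }
};
int Widget::constructed = 0;

// Returns the Widget stored under key, constructing Widget(name, size) in
// place only when the key is not present yet.
Widget& get_or_create(std::unordered_map<int, Widget>& m, int key,
                      const std::string& name, int size) {
    return m.try_emplace(key, name, size).first->second;
}
```

Calling get_or_create twice with the same key returns a reference to the same element both times, and the constructor arguments of the second call are simply ignored.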

Imagine a struct T:
struct T {
int i1, i2;
// no default constructor
explicit T(int i1, int i2): i1(i1), i2(i2) { }
};
With a default constructor it's quite easy:
aMap[123] = T(1, 23);
The operator[] guarantees that a non-existing entry is created on demand (but for this it needs the default constructor of the mapped type).
If the mapped_type doesn't provide a default constructor, OP's intention can be matched by a simple combination of std::unordered_map::find() and std::unordered_map::insert() (or just the latter, with a check for success).
(This part was inserted later, as Lightness Races in Orbit pointed out that I skipped this simple solution and moved directly to the more complicated one. He wrote an alternative answer concerning this.) As it is lacking a demonstrational MCVE, I took mine and adapted it:
#include <iostream>
#include <unordered_map>
struct T {
int i1, i2;
// no default constructor
explicit T(int i1, int i2): i1(i1), i2(i2)
{
std::cout << "T::T(" << i1 << ", " << i2 << ")\n";
}
};
int main()
{
typedef std::unordered_map<int, T> Map;
Map aMap;
//aMap[123] = T(1, 23); doesn't work without default constructor.
for (int i = 0; i < 2; ++i) {
Map::key_type key = 123;
Map::iterator iter = aMap.find(key);
if (iter == aMap.end()) {
std::pair<Map::iterator, bool> ret
= aMap.insert(Map::value_type(key, T(1 + i, 23)));
if (ret.second) std::cout << "Insertion done.\n";
else std::cout << "Insertion failed! Key " << key << " already there.\n";
} else {
std::cout << "Key " << key << " found.\n";
}
}
for (const auto &entry : aMap) {
std::cout << entry.first << " -> (" << entry.second.i1 << ", " << entry.second.i2 << ")\n";
}
return 0;
}
Output:
T::T(1, 23)
Insertion done.
Key 123 found.
123 -> (1, 23)
If the mapped type does lack a copy constructor as well then it's still solvable using std::unordered_map::emplace() (again with or without pre-check with std::unordered_map::find()):
aMap.emplace(std::piecewise_construct,
std::forward_as_tuple(123),
std::forward_as_tuple(1, 23));
The adapted sample:
#include <iostream>
#include <unordered_map>
struct T {
int i1, i2;
// no default constructor
explicit T(int i1, int i2): i1(i1), i2(i2)
{
std::cout << "T::T(" << i1 << ", " << i2 << ")\n";
}
// copy constructor and copy assignment disabled
T(const T&) = delete;
T& operator=(const T&) = delete;
};
int main()
{
typedef std::unordered_map<int, T> Map;
Map aMap;
for (int i = 0; i < 2; ++i) {
Map::key_type key = 123;
Map::iterator iter = aMap.find(key);
if (iter == aMap.end()) {
std::pair<Map::iterator, bool> ret
= aMap.emplace(std::piecewise_construct,
std::forward_as_tuple(key),
std::forward_as_tuple(1 + i, 23));
if (ret.second) std::cout << "Insertion done.\n";
else std::cout << "Insertion failed! Key " << key << " already there.\n";
} else {
std::cout << "Key " << key << " found.\n";
}
}
for (const auto &entry : aMap) {
std::cout << entry.first << " -> (" << entry.second.i1 << ", " << entry.second.i2 << ")\n";
}
return 0;
}
Output:
T::T(1, 23)
Insertion done.
Key 123 found.
123 -> (1, 23)
As Aconcagua mentioned in a comment, without the pre-checking find(), emplace() might construct the mapped value even if the insertion fails.
The documentation of std::unordered_map::emplace() on cppreference mentions this:
The element may be constructed even if there already is an element with the key in the container, in which case the newly constructed element will be destroyed immediately.
As Jarod42 mentioned, std::unordered_map::try_emplace() is an alternative in C++17 worth mentioning, because:
Unlike insert or emplace, these functions do not move from rvalue arguments if the insertion does not happen, which makes it easy to manipulate maps whose values are move-only types, such as std::unordered_map<std::string, std::unique_ptr<foo>>. In addition, try_emplace treats the key and the arguments to the mapped_type separately, unlike emplace, which requires the arguments to construct a value_type (that is, a std::pair)

[] implements get_add_if_missing. Semantically, an overhead-free implementation would be something like:
value_type& get_add_if_missing(key_type const& k, auto&& factory) {
auto b = bucket_for(k);
auto pos = pos_for(k, b);
if (pos == b.end()) {
return b.append(k, factory());
} else {
return *pos;
}
}
A full equivalent is not in the API yet (as of C++17), so for now you need to decide which suboptimality to accept, based on how expensive it is to create a temporary value_type:
do an extra lookup (search then insert if missing)
extra temporary (insert/emplace always, covered well in other answers)
An extra lookup version is:
const auto itr = m.find(key);
if (itr == m.end()) {
// insert or emplace a new element with the constructor of your choice
}
The std::unordered_map article on cppreference should have enough usage examples for insert / emplace.
With a tree-based implementation (std::map) a zero overhead get_add_if_missing emulation is quite possible with lower_bound followed by a hint-enabled insert / emplace.
And finally the good news -- if you can accept Boost.Intrusive (a header-only library) as a dependency, you can build a truly zero-overhead get_add_if_missing (no temporaries or repeated hash calculation). An API of the hash map from there is sufficiently detailed for that.
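The lower_bound idea mentioned above for std::map could look roughly like the following sketch. The free function get_add_if_missing and its factory parameter are hypothetical, written only to illustrate the technique; it is not part of the standard library.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical helper: get-or-create for std::map with a single tree search.
// make is called only when the key is missing.
template <typename Key, typename Value, typename Factory>
Value& get_add_if_missing(std::map<Key, Value>& m, const Key& key, Factory make) {
    auto pos = m.lower_bound(key);  // first element whose key is not less than key
    if (pos == m.end() || m.key_comp()(key, pos->first)) {
        // Key is absent, and pos is exactly where it belongs: use it as the hint.
        pos = m.emplace_hint(pos, key, make());
    }
    return pos->second;
}
```

The value returned by make is moved into the new node, so the mapped type has to be move-constructible. With a correct hint the insertion itself does not need a second search through the tree.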

Erase Method in Vector

I'm writing some code in which there are two vectors, each containing four smart pointers. I accidentally passed an iterator obtained from the first vector to the erase method of the second vector, and the program crashes. I have learned that erase involves copy and move operations on the elements. Using the debugger, I found that 1) a nullptr and 2 smart pointers remain in the 1st vector, 2) 4 smart pointers remain in the 2nd vector, and 3) the program only starts to crash after several successful runs. My questions are as follows:
How is the nullptr appended to the 1st vector?
Why is it permissible to apply the iterator to the 2nd vector?
Why does the program not crash from the outset?
BTW, my platform is Xcode 8.1. Thanks in advance
#include <memory>
#include <vector>
#include <iostream>
#include <string>
using namespace std;
class A{
public:
A(string name_) : name(name_) {cout << name << " construction\n";}
const string& get_name() const {return name;}
~A() {cout <<get_name() << " destruction\n";}
A (const A& rhs) : name(rhs.name){cout << "A copy constructor\n";}
A(A&& rhs) : name(""){
cout <<"A move constructor\n";
swap(rhs);
}
void swap(A& rhs) noexcept {
std::swap(name, rhs.name);
}
private:
string name;
};
void foo();
int main(){
foo();
}
void foo(){
vector<shared_ptr<A>> vect1, vect2;
auto a1 = make_shared<A>("Mike");
auto a2 = make_shared<A>("Alice");
auto a3 = make_shared<A>("Peter");
auto a4 = make_shared<A>("Paul");
vect1.push_back(a1);
vect1.push_back(a2);
vect1.push_back(a3);
vect1.push_back(a4);
vect2.push_back(a4);
vect2.push_back(a1);
vect2.push_back(a2);
vect2.push_back(a3);
auto it = vect1.begin();
vect1.erase(it);
for (auto &c : vect1){
cout << c->get_name() << endl;
}
vect2.erase(it);
for (auto &c : vect2){
cout << c->get_name() << endl;
}
}

In VS2015 it fails on the line vect2.erase(it); with the message "Iterator out of bounds", which indeed it is.
As you state in the question it doesn't even belong to vect2.
Even if it works on your platform it is undefined behavior. So from then on anything can happen.
You're not working within the c++ standard anymore, you're now dealing with whichever way your platform was implemented (maybe it uses pointers for iterators, maybe it uses offsets; who knows?).
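A simple way to stay within defined behavior is to look the element up in the container you are about to modify, so the iterator always belongs to that container. The helper erase_first below is a hypothetical name used for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <vector>

// Erases the first element equal to target and reports whether one was found.
// The iterator passed to erase is obtained from v itself, so it is always
// valid for that container.
template <typename T>
bool erase_first(std::vector<std::shared_ptr<T>>& v,
                 const std::shared_ptr<T>& target) {
    auto it = std::find(v.begin(), v.end(), target);
    if (it == v.end()) return false;
    v.erase(it);
    return true;
}
```

In the question's code this would mean calling erase_first(vect1, a1) and erase_first(vect2, a1) instead of reusing the iterator from vect1 on vect2.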

Force use of copy constructor / Avoid use of copy constructor

I'm currently writing a logging class (just for practice) and ran into an issue. I have two classes: the class Buffer acts as a temporary buffer and flushes itself in its destructor, and the class Proxy returns a Buffer instance, so I don't have to write Buffer() all the time.
Anyways, here is the code:
#include <iomanip>
#include <iostream>
#include <sstream>
#include <string>
class Buffer
{
private:
std::stringstream buf;
public:
Buffer(){};
template <typename T>
Buffer(const T& v)
{
buf << v;
std::cout << "Constructor called\n";
};
~Buffer()
{
std::cout << "HEADER: " << buf.str() << "\n";
}
Buffer(const Buffer& b)
{
std::cout << "Copy-constructor called\n";
// How to get rid of this?
};
Buffer(Buffer&&) = default;
Buffer& operator=(const Buffer&) & = delete;
Buffer& operator=(Buffer&&) & = delete;
template <typename T>
Buffer& operator<<(const T& v)
{
buf << v;
return *this;
}
};
class Proxy
{
public:
Proxy(){};
~Proxy(){};
Proxy(const Proxy&) = delete;
Proxy(Proxy&&) = delete;
Proxy& operator=(const Proxy&) & = delete;
Proxy& operator=(Proxy&&) & = delete;
template <typename T>
Buffer operator<<(const T& v) const
{
if(v < 0)
return Buffer();
else
return Buffer(v);
}
};
int main () {
Buffer(Buffer() << "Test") << "what";
Buffer() << "This " << "works " << "fine";
const Proxy pr;
pr << "This " << "doesn't " << "use the copy-constructor";
pr << "This is a " << std::setw(10) << " test";
return 0;
}
Here is the output:
Copy-constructor called
HEADER: what
HEADER: Test
HEADER: This works fine
Constructor called
HEADER: This doesn't use the copy-constructor
Constructor called
HEADER: This is a test
The code does exactly what I want but it depends on RVO. I read multiple times that you should not rely on RVO so I wanted to ask how I can:
Avoid RVO completely so that the copy constructor is called every time
Avoid the copy constructor
I already tried to avoid the copy constructor by returning a reference or moving, but that segfaults. I guess that's because the temporary in Proxy::operator<< is destroyed during return.
I'd also be interested in completely different approaches that do roughly the same.

This seems like a contrived problem. Firstly, the code works whether RVO is enabled or disabled (you can test this with G++ by passing the -fno-elide-constructors flag). Secondly, the way you are designing the return of a Buffer object for use with the << operator can only be done by copying†: the Proxy::operator<<(const T& v) function creates a new Buffer instance on the stack, which is destroyed when you leave the function call (i.e. between each concatenation in pr << "This " << "doesn't " << "use the copy-constructor";). This is why you get a segmentation fault when you return a reference to this object and use it outside the function.
Alternatively, you could define a << operator to use dynamic memory by e.g. returning a unique_ptr<Buffer>:
#include <memory>
...
std::unique_ptr<Buffer> operator<<(const T& v) const
{
if(v < 0)
return std::unique_ptr<Buffer>(new Buffer());
else
return std::unique_ptr<Buffer>(new Buffer(v));
}
However, your original concatenation statements then no longer compile: Proxy::operator<<(const T& v) now returns a std::unique_ptr<Buffer> rather than a Buffer, and std::unique_ptr<Buffer> has no operator<< of its own, so chained concatenations won't work without first explicitly dereferencing the returned pointer:
const Proxy pr;
std::unique_ptr<Buffer> pb = pr << "This ";
// pb << "doesn't " << "use the copy-constructor"; // This line doesn't work
*pb << "doesn't " << "use the copy-constructor";
In other words, your classes rely inherently on copying and so, if you really want to avoid copying, you should throw them away and completely re-design your logging functionalities.
† I'm sure there's some black-magic voodoo which can be invoked to make this possible --- albeit at the cost of one's sanity.
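For what it's worth, one possible alternative (a sketch, not the approach of the answer above) is to make the buffer move-only and let the moved-from object skip the flush. LineBuffer, begin_line and the std::string sink are all hypothetical names chosen for this illustration; the sink replaces std::cout only to keep the example easy to check.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical move-only buffer: appends its contents to *sink_ on
// destruction, unless it has been moved from.
class LineBuffer {
    std::ostringstream buf_;
    std::string* sink_;
    bool owns_line_ = true;
public:
    explicit LineBuffer(std::string& sink) : sink_(&sink) {}
    LineBuffer(const LineBuffer&) = delete;
    LineBuffer& operator=(const LineBuffer&) = delete;
    LineBuffer(LineBuffer&& other)
        : buf_(std::move(other.buf_)), sink_(other.sink_) {
        other.owns_line_ = false;  // the moved-from object must not flush
    }
    ~LineBuffer() {
        if (owns_line_) *sink_ += buf_.str() + "\n";
    }
    template <typename T>
    LineBuffer& operator<<(const T& v) {
        buf_ << v;
        return *this;
    }
};

// Plays the role of Proxy::operator<<: returns a fresh buffer by value.
template <typename T>
LineBuffer begin_line(std::string& sink, const T& first) {
    LineBuffer b(sink);
    b << first;
    return b;  // moved or elided, never copied (the copy constructor is deleted)
}
```

Whether the compiler elides the move or not, exactly one object ends up owning the line, so the output is the same with and without RVO.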

which range loop(auto, auto&, const auto&) is more efficient in c++11

// Example program
#include <iostream>
#include <ostream>
#include <string>
#include <vector>
class One
{
public:
One(int age, int price)
: m_age(age), m_price(price)
{
std::cout << "m_age: " << m_age << " , m_price: " << m_price << std::endl;
}
One(const One&) = default;
One& operator=(const One&) = default;
int age() const { return m_age; }
int price() const { return m_price; }
private:
int m_age;
int m_price;
};
std::ostream& operator<<(std::ostream& os, const One& one)
{
os << "<< m_age: " << one.age() << " , m_price: " << one.price();
return os;
}
int main()
{
std::vector<One> vecOnes = {{1, 2}, {3, 4}};
//for(auto it: vecOnes) // case I
//for(auto& it: vecOnes) // case II
for(const auto& it: vecOnes) // case III
{
std::cout << it << std::endl;
}
}
All three cases output the same results as follows:
m_age: 1 , m_price: 2
m_age: 3 , m_price: 4
<< m_age: 1 , m_price: 2
<< m_age: 3 , m_price: 4
Question: which case is the most efficient way to use auto?
Originally, I expected that auto would trigger the constructor of class One. But the output doesn't show that.

Originally, I expected that auto would trigger the constructor of class One. But the output doesn't show that.
It does trigger a constructor: the copy constructor. But you didn't instrument that one† so you don't see any output. The other two cases don't construct a new object, so will definitely be more efficient.
Note that there is also a fourth case: for (auto&& it : vecOnes) {...}. It will be equivalent to your second case here and also not create any new objects.
†Well now that you edited your question, it should be pretty clear that the one case does construct new objects and the others all do not.

The reality is, it depends...
If you've just got a vector of a built-in type, such as an int then you might as well use auto it as it's cheap to make a copy and you'll save a dereference every time you use the value. For classes this will invoke the copy constructor which may affect performance.
However, if your vector contains objects of a class type, then in general it'll be more efficient to use auto &it or const auto &it in order to avoid creating a copy of each object. There's no cost advantage to using const over non-const; it just comes down to how you want to interact with the object.
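To see the difference in numbers rather than log output, the following sketch counts copy constructions for the by-value and by-reference loops. Tracked and the two helper functions are hypothetical names made up for this example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical element type that counts how often it is copy-constructed.
struct Tracked {
    static int copies;
    int value;
    explicit Tracked(int v) : value(v) {}
    Tracked(const Tracked& other) : value(other.value) { ++copies; }
    Tracked(Tracked&& other) noexcept : value(other.value) {}
};
int Tracked::copies = 0;

// Case I: "auto item" copy-initializes a new object for every element.
int copies_with_auto(const std::vector<Tracked>& v) {
    Tracked::copies = 0;
    int sum = 0;
    for (auto item : v) sum += item.value;
    (void)sum;  // only the copy count matters here
    return Tracked::copies;
}

// Case III: "const auto& item" binds a reference, no object is created.
int copies_with_const_ref(const std::vector<Tracked>& v) {
    Tracked::copies = 0;
    int sum = 0;
    for (const auto& item : v) sum += item.value;
    (void)sum;  // only the copy count matters here
    return Tracked::copies;
}
```

For a vector of three elements, the first function reports three copies and the second reports none.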

Prevent reassignment of a reference?

Consider that in some library somewhere (which we have no access to change), we have a Counter class:
class Counter {
int count;
public:
Counter() : count(0) { }
void bump() { ++count; }
int getCount() const { return count; }
};
which, by its very nature, is mutable. If it's const, it's pretty worthless.
And in our code, we "use" that Counter. Badly.
#include <string>
#include <iostream>
#include <Counter.hpp>
using std::cout;
using std::endl;
void breakTheHellOutOfCounter(Counter &c) {
// This is OK
c.bump();
// Oh noes!
c = Counter();
}
int main() {
Counter c;
c.bump(); c.bump(); c.bump();
std::cout << "Count was " << c.getCount() << std::endl;
breakTheHellOutOfCounter(c);
std::cout << "Count is now " << c.getCount() << std::endl;
}
Note that breakTheHellOutOfCounter overwrites main's counter with a shiny new one, resetting the count. That's going to cause the caller some grief. (Imagine something a lot more harmful happening, and you'll see where I'm going here.)
I need to be able to bump c (and thus, I need it mutable), but I want breakTheHellOutOfCounter() to fail miserably due to trying to replace c. Is there a way I can change things (other than the Counter class) to make that happen?
(I'm aware that at the lowest levels, this is all but impossible to enforce. What I want is a way to make it hard to do accidentally.)

The cleanest solution I can see to this without modifying counter itself is something like:
#include <string>
#include <iostream>
#include <utility> // std::forward
#include <Counter.hpp>
template <typename T>
struct Unbreakable : public T {
Unbreakable<T>& operator=(const Unbreakable<T>&) = delete;
Unbreakable<T>& operator=(Unbreakable<T>&&) = delete;
template <typename ...Args>
Unbreakable(Args&& ...args) : T(std::forward<Args>(args)...) {}
};
using std::cout;
using std::endl;
void breakTheHellOutOfCounter(Unbreakable<Counter> &c) {
// this is ok
c.bump();
// oh noes!
c = Counter();
}
int main() {
Unbreakable<Counter> c;
c.bump(); c.bump(); c.bump();
std::cout << "Count was " << c.getCount() << std::endl;
breakTheHellOutOfCounter(c);
std::cout << "Count is now " << c.getCount() << std::endl;
}
Which correctly gives an error from your "oh noes" line. (Example uses C++11, but C++98 solution is similar)
That doesn't rule out usage like:
Counter& br = c;
br = Counter();
of course, but without modifying Counter itself I don't think that's avoidable.
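Both halves of that statement can be checked at compile time with std::is_assignable. The sketch below repeats the wrapper and uses a local stand-in for the library's Counter so that it compiles on its own; the names wrapper_assignable and base_assignable are made up for this illustration.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Stand-in for the library's Counter, repeated here so the sketch is
// self-contained.
class Counter {
    int count;
public:
    Counter() : count(0) {}
    void bump() { ++count; }
    int getCount() const { return count; }
};

template <typename T>
struct Unbreakable : public T {
    Unbreakable& operator=(const Unbreakable&) = delete;
    Unbreakable& operator=(Unbreakable&&) = delete;
    template <typename... Args>
    Unbreakable(Args&&... args) : T(std::forward<Args>(args)...) {}
};

// Assigning a Counter to the wrapper is rejected ...
constexpr bool wrapper_assignable =
    std::is_assignable<Unbreakable<Counter>&, Counter>::value;
// ... but a plain Counter& (for example one bound to the wrapper) still
// accepts assignment.
constexpr bool base_assignable =
    std::is_assignable<Counter&, Counter>::value;

static_assert(!wrapper_assignable, "Unbreakable<Counter> must not be assignable");
static_assert(base_assignable, "Counter itself remains assignable");
```

The wrapper therefore protects against accidental assignment through its own type only, which matches the caveat above.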

The simplest way to do this is to remove the assignment operator from the Counter class. However, since you don't have the ability to change the Counter class, your only real option is to wrap the Counter class in a class with no assignment operator and use that instead.

As Michael Anderson said, you can wrap your counter object in a class that prevents assignment.
class CounterProxy {
Counter& counter;
CounterProxy & operator=(const CounterProxy&);
public:
CounterProxy(Counter& c) : counter(c) {}
void bump() { counter.bump(); }
int getCount() const { return counter.getCount(); }
};
void breakTheHellOutOfCounter(CounterProxy &c) {
// this is ok
c.bump();
// not oh noes! this line is rejected by the compiler:
// operator= is private and never defined
c = CounterProxy(Counter());
}
int main() {
Counter c;
c.bump(); c.bump(); c.bump();
std::cout << "Count was " << c.getCount() << std::endl;
// use a named proxy: a temporary cannot bind to a non-const reference
CounterProxy proxy(c);
breakTheHellOutOfCounter(proxy);
std::cout << "Count is now " << c.getCount() << std::endl;
}
You can use this method whenever you want to limit the operations that can be performed on an object.
EDIT: You're probably already aware of this and looking for a more elegant solution, but the code might help others.

By allowing bump via a mutable reference, you are giving the function access to mess with the object state. There is nothing special about assignment; it's just a function that mutates the object in some way. It could just as well be a function called CopyStateFromAnotherInstance() instead of operator =().
So the real problem is: How do you allow only certain functions but hide others? By using an interface:
class IBumpable
{
public:
virtual void bump() = 0;
virtual ~IBumpable() {}
};
class Counter : public IBumpable
{
....
};
void functionThatCannotBreakCounter(IBumpable& counter) { ... }
