What's the difference between these two std::vector assignment methods? - c++

There are two ways (that I know) of assigning one vector to another:
std::vector<std::string> vectorOne, vectorTwo;
// fill vectorOne with strings
// First assign method
vectorTwo = vectorOne;
// Second assign method
vectorTwo.assign( vectorOne.begin(), vectorOne.end() );
Is there really a difference between these methods, or are they equal in terms of efficiency and safety when performed on very big vectors?

They're pretty much equivalent. The reason for the second is
that you might have types which need (implicit) conversion:
std::vector<int> vi;
std::vector<double> vd;
// ...
vd.assign( vi.begin(), vi.end() );
Or the type of the container might be different:
vd.assign( std::istream_iterator<double>( file ),
std::istream_iterator<double>() );
If you know that both of the containers are the same type, just
use assignment. It has the advantage of only using a single
reference to the source, and possibly allowing move semantics in
C++11.

The second form is generic: it works with any iterator types and just copies the elements from the source vector.
The first form only works with exactly the same type of vector, it copies the elements and in C++11 might also replace the allocator by copying the allocator from the source vector.
In your example the types are identical, and use std::allocator which is stateless, so there is no difference. You should use the first form because it's simpler and easier to read.
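A minimal sketch of that claim, assuming identical vector types and the default allocator. The helper names copy_by_assignment and copy_by_assign are made up for this illustration and do not come from the question:

```cpp
#include <string>
#include <vector>

// Hypothetical helper: copy using operator=.
std::vector<std::string> copy_by_assignment(const std::vector<std::string>& src) {
    std::vector<std::string> dst{"stale"};  // pre-existing contents get replaced
    dst = src;
    return dst;
}

// Hypothetical helper: copy using the iterator-range overload of assign().
std::vector<std::string> copy_by_assign(const std::vector<std::string>& src) {
    std::vector<std::string> dst{"stale"};  // pre-existing contents get replaced
    dst.assign(src.begin(), src.end());
    return dst;
}
```

Both helpers should return a vector that compares equal to src, so for this case the choice is about readability rather than the result.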

They are equivalent in this case [and per the C++03 standard]. However, there is a difference if vectorTwo contains elements before the assignment:
vectorTwo = vectorOne; // use operator=
// Any elements held in the container before the call
// are either assigned to or destroyed.
vectorTwo.assign( vectorOne.begin(), vectorOne.end() ); // any elements held in the container
// before the call are destroyed and replaced by newly
// constructed elements (no assignments of elements take place).
assign is needed because operator= takes a single right-hand operand, so assign is used when there is a need for a default argument value or a range of values. What assign does could be done indirectly by first creating a suitable vector and then assigning that:
void f(vector<Book>& v, list<Book>& l){
vector<Book> vt(l.begin(), l.end()); // temporary vector built from the list's range
v = vt;
}
However, this can be both ugly and inefficient (the example is taken from Bjarne Stroustrup, "The C++...").
Also note that if the vectors are not of the same type, assign is needed as well, because it allows implicit conversion of the elements:
vector<int> vi;
vector<double> vd;
// ...
vd.assign( vi.begin(), vi.end() );
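As a rough sketch of the two cases where operator= cannot be used directly: a source that is a different container with a different element type, and a count plus a fill value. The helper names from_list and filled are assumptions made for this example:

```cpp
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical helper: copy a list<int> into a vector<double>
// without building a temporary vector first.
std::vector<double> from_list(const std::list<int>& src) {
    std::vector<double> dst;
    dst.assign(src.begin(), src.end());  // each int is converted to double
    return dst;
}

// Hypothetical helper: the count-and-value overload of assign().
std::vector<double> filled(std::size_t count, double value) {
    std::vector<double> dst{1.0, 2.0};  // old contents are discarded
    dst.assign(count, value);
    return dst;
}
```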

Related

Is it possible to create an STL container where the contents are mutable, but the container's attributes aren't?

This is probably easiest to explain with examples.
std::vector<int> a;
a.push_back(1);
a[0] = 2;
const std::vector<int>& b = a;
b.push_back(1); // Compile error, which I want.
b = std::vector<int>(); // Compile error, which I want.
b[0] = 2; // Compile error, which I don't want.
This is sort of similar to an array declared like:
char* const ptr = array;
where here the values can be modified, but the pointer to the array can't be changed.
The basic reason for this is to be able to allocate and initialize data in one module, while passing it by reference and having it processed elsewhere. I want to indicate that the underlying allocations shouldn't be manipulated.
This could sort of be achieved for the containers with a data() function by just passing this data pointer instead of the container. Also, if I were to write my own class, I suppose I could achieve something like this with mutable or const_casts, but I was curious if there's any STL feature I was missing.
No, none of the standard library containers offer that behavior. You will need to write your own container or adapt one from the standard library ones to have this feature.
However, if you do not want to allow modifying the containers themselves (only their elements), then consider passing the range of the elements contained in the container instead of the container itself to your users.
For example (pre-C++20) pass a.begin() and a.end() as an iterator pair denoting the range of elements to the user. The user doesn't need to care about where the iterators originate from if they only need to access the elements in the range and they can't modify the container itself through the iterators.
Since C++20, pass the range of elements as some form of view. For example, for contiguous containers such as std::vector, pass a std::span constructed from the container. std::span basically encapsulates passing .data() and .size() of such containers, as you also suggested.
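A minimal sketch of the pre-C++20 iterator-pair approach, with hypothetical names (double_each, make_doubled). The consumer can change element values but has no handle through which to resize or reassign the container. With C++20 available, the parameter could instead be a std::span<int>:

```cpp
#include <vector>

// Hypothetical consumer: it sees only a range of elements, so it can change
// values but cannot push_back, erase, or reassign the container.
template <typename ForwardIt>
void double_each(ForwardIt first, ForwardIt last) {
    for (; first != last; ++first) {
        *first *= 2;
    }
}

// Hypothetical owner-side function: allocation and initialization stay here.
std::vector<int> make_doubled() {
    std::vector<int> a{1, 2, 3};
    double_each(a.begin(), a.end());
    return a;
}
```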

C++ - std::initializer_list vs std::span

What is the difference between std::initializer_list and std::span? Both are contiguous sequences of values of some type. Both are non-owning.
So when do we use the first and when do we use the latter?
The short answer is that std::initializer_list<T> is used to create a new range, for the purposes of initialization, while std::span<T> is used to refer to existing ranges, for better APIs.
std::initializer_list<T> is backed by a language feature: the compiler actually constructs a new array for it to refer to. It solves the problem of how to conveniently initialize containers:
template <typename T>
struct vector {
vector(std::initializer_list<T>);
};
vector<int> v = {1, 2, 3, 4};
That creates a std::initializer_list<int> on the fly containing the four integers there, and passes it into vector so that it can construct itself properly.
This is really the only place std::initializer_list<T> should be used: either constructors or function parameters to pass a range in on the fly, for convenience (unit tests are a common place that desires such convenience).
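For the function-parameter case, a small sketch might look like the following. The function name sum_of is invented for this illustration:

```cpp
#include <initializer_list>

// Hypothetical function that accepts values written inline at the call site.
int sum_of(std::initializer_list<int> values) {
    int total = 0;
    for (int v : values) {
        total += v;
    }
    return total;
}
```

A caller can then write sum_of({1, 2, 3, 4}) without first building a container.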
std::span<T> on the other hand is used to refer to an existing range. Its job is to replace functions of the form:
void do_something(int*, size_t);
with
void do_something(std::span<int>);
Which makes such functions generally easier to use and safer. std::span is constructible from any appropriate contiguous range, so taking our example from earlier:
std::vector<int> v = {1, 2, 3, 4};
do_something(v); // ok
It can also be used to replace functions of the form:
void do_something_else(std::vector<unsigned char> const&);
Which can only be called with, specifically, a vector, with the more general:
void do_something_else(std::span<unsigned char const>);
Which can be called with any backing contiguous storage over unsigned char.
With span you have to be careful, since it's basically a reference that just doesn't get spelled with a &, but it is an extremely useful type.
The principal difference between the two is how the language treats them. Or more specifically, how it doesn't in the case of a span.
You cannot create an initializer_list from an existing container object or array. They can only be created by copying from other initializer_lists or from a braced-init-list (i.e. { stuff }). That is, the compiler governs the creation of an initializer_list and the array it points into.
Similarly, a constructor that takes only an initializer_list has special meaning to the compiler. When performing list initialization on that type, such constructors are given priority in overload resolution. span is given no special meaning by the compiler.
initializer_list also has lifetime rules that are different from span. The lifetime of the array pointed to by an initializer_list is the lifetime of the initializer_list object that was created from a braced-init-list. This is not true of span; the lifetime of a span created from a container is whatever you say it is based on how you use it in code.
Broadly speaking, you should only use initializer_list when you're initializing an object.
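One difference is that a std::span can be created at run time over an existing range of any size, whereas the contents of a std::initializer_list come only from a braced-init-list written in the source.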

std::map - adding element using subscript operator Vs insert method

I am trying to understand whether three different ways of inserting elements into a std::map are effectively the same.
std::map<int, char> mymap;
Just after declaring mymap, will inserting an element with value 'a' for key 10 be the same with each of these three methods?
mymap[10]='a';
mymap.insert(mymap.end(), std::make_pair(10, 'a'));
mymap.insert(std::make_pair(10, 'a'));
Especially, does it make any sense using mymap.end() when there is no existing element in std::map?
The main difference is that (1) first default-constructs a value object in the map in order to be able to return a reference to this object. This enables you to assign something to it.
Keep that in mind if you are working with types that are stored in a map, but have no default constructor. Example:
struct A {
explicit A(int) {};
};
std::map<int, A> m;
m[10] = A(42); // Error! A has no default ctor
m.insert(std::make_pair(10, A(42))); // Ok
m.insert(m.end(), std::make_pair(10, A(42))); // Ok
The other notable difference is that (as @PeteBecker pointed out in the comments) (1) overwrites existing entries in the map, while (2) and (3) don't.
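A short sketch of the overwrite behaviour, using hypothetical helper names (after_subscript, after_insert). Each helper takes the map by value so the caller's map is left alone:

```cpp
#include <map>
#include <utility>

// Hypothetical helper: operator[] on an existing key replaces the mapped value.
char after_subscript(std::map<int, char> m) {
    m[10] = 'b';
    return m.at(10);
}

// Hypothetical helper: insert() on an existing key leaves the map unchanged;
// the bool in the returned pair reports whether an insertion happened.
std::pair<char, bool> after_insert(std::map<int, char> m) {
    const auto result = m.insert(std::make_pair(10, 'b'));
    return std::make_pair(m.at(10), result.second);
}
```

Starting from a map that already holds {10, 'a'}, the first helper should yield 'b', while the second should yield 'a' together with false.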
Yes, they are effectively the same. Just after declaring mymap, all three methods leave mymap holding the single entry {10, 'a'}.
It is OK to use mymap.end() when there is no existing element in std::map. In this case, begin() == end(), which is the universal way of denoting an empty container.
(1) is different from (2) and (3) if there exists an element with the same key. (1) will replace the element, whereas (2) and (3) will fail and return a value denoting that the insertion didn't happen.
(1) also requires that the mapped type is default constructible. In fact, (1) first default-constructs the object if it is not already present and then replaces it with the value specified.
(2) and (3) are also different. To understand the difference we need to understand what the iterator in (2) does. From cppreference, the iterator refers to a hint where insertion happens as close to that hint as possible. There is a performance difference depending on the validity of the hint. Quoting from the same page:
Amortized constant if the insertion happens in the position just after the hint, logarithmic in the size of the container otherwise. (until C++11)
Amortized constant if the insertion happens in the position just before the hint, logarithmic in the size of the container otherwise. (since C++11)
So for large maps we can get a performance boost if we already know the position somehow.
Having said all of this, if the map has just been created and you are doing the operation with no prior elements in the map, as you said in the question, then I would say that all three will be practically the same (though their internal operations will differ as specified above).
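To illustrate where the hint can pay off, here is a sketch that builds a map from keys assumed to arrive in ascending order. The function name from_sorted_keys and the placeholder value 'x' are assumptions for this example. Each new key belongs just before end(), which matches the C++11 wording quoted above:

```cpp
#include <map>
#include <utility>
#include <vector>

// Hypothetical helper: build a map from keys that are expected to be sorted.
// When they are, end() is an accurate hint for every insertion.
std::map<int, char> from_sorted_keys(const std::vector<int>& sorted_keys) {
    std::map<int, char> m;
    for (int key : sorted_keys) {
        m.insert(m.end(), std::make_pair(key, 'x'));
    }
    return m;
}
```

If the keys turn out not to be sorted, the map is still built correctly; the hint only affects how much work each insertion does.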

Elegant way to create a vector of reference

I have a vector of Foo
vector<Foo> inputs;
Foo is a struct with some score inside
struct Foo {
...
float score;
bool winner;
};
Now I want to sort inputs by score and only assign winner to the top 3. But I don't want to change the original inputs vector. So I guess I need to create a vector of reference then sort that? Is it legal to create a vector of reference? Is there an elegant way to do so?
Here are two different ways of creating a vector<Foo*>:
vector<Foo*> foor;
for (auto& x:inputs)
foor.push_back(&x);
vector<Foo*> foob(inputs.size(),nullptr);
transform(inputs.begin(), inputs.end(), foob.begin(), [](auto&x) {return &x;});
You can then use standard algorithms to sort your vectors of pointers without changing the original vector (if this is a requirement):
// decreasing order according to score
sort(foob.begin(), foob.end(), [](Foo*a, Foo*b)->bool {return a->score>b->score;});
You may finally change the top n elements, either using for_each_n() algorithm (if C++17) or simply with an ordinary loop.
The only example code given was for pointers, and the IMO far more fitting std::reference_wrapper was only mentioned, with no indication of how it might be used in a situation like this. I want to fix that!
Non-owning pointers have at least 3 drawbacks:
the visual, from having to pepper &, *, and -> in code using them;
the practical: if all you want is a reference to one object, now you have a thing that can be subtracted from other pointers (which may not be related), be inc/decremented (if not const), do stuff in overload resolution or conversion, etc. – none of which you want. I'm sure everyone is laughing at this and saying 'I'd never make such silly mistakes', but you know in your gut that, on a long enough timeline, it will happen.
and the lack of self-documentation, as they have no innate semantics of ownership or lack thereof.
I typically prefer std::reference_wrapper, which
clearly self-documents its purely observational semantics,
can only yield a reference to an object, thus not having any pointer-like pitfalls, and
sidesteps many syntactical problems by implicitly converting to the real referred type, thus minimising operator noise where you can invoke conversion (pass to a function, initialise a reference, range-for, etc.)... albeit interfering with the modern preference for auto – at least until we get the proposed operator. or operator auto – and requiring the more verbose .get() in other cases or if you just want to avoid such inconsistencies. Still, I argue that these wrinkles are neither worse than those of pointers, nor likely to be permanent given various active proposals to prettify use of wrapper/proxy types.
I'd recommend that or another vocabulary class, especially for publicly exposed data. There are experimental proposal(s) for observer_ptrs and whatnot, but again, if you don't really want pointer-like behaviour, then you should be using a wrapper that models a reference... and we already have one of those.
So... the code in the accepted answer can be rewritten like so (now with #includes and my preferences for formatting):
#include <algorithm>
#include <functional>
#include <vector>
// ...
void
modify_top_n(std::vector<Foo>& v, int const n)
{
std::vector< std::reference_wrapper<Foo> > tmp{ v.begin(), v.end() };
std::nth_element( tmp.begin(), tmp.begin() + n, tmp.end(),
[](Foo const& f1, Foo const& f2){ return f1.score > f2.score; } );
std::for_each( tmp.begin(), tmp.begin() + n,
[](Foo& f){ f.winner = true; } );
}
This makes use of the range constructor to construct a range of reference_wrappers from the range of real Foos, and the implicit conversion to Foo& in the lambda argument lists to avoid having to do reference_wrapper.get() (and then we have the far less messy direct member access by . instead of ->).
Of course, this can be generalised: the main candidate for factoring out to a reusable helper function is the construction of a vector< reference_wrapper<Foo> > for arbitrary Foo, given only a pair of iterators-to-Foo. But we always have to leave something as an exercise to the reader. :P
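One possible shape for that helper, offered only as a sketch: the name make_ref_vector is invented here, and the referred-to type is derived from the iterator's reference type so that const iterators yield reference_wrapper<const T>.

```cpp
#include <functional>
#include <iterator>
#include <type_traits>
#include <vector>

// Hypothetical helper: build a vector of reference_wrappers that refer to the
// elements of [first, last). The elements themselves are not copied.
template <typename It>
auto make_ref_vector(It first, It last)
    -> std::vector<std::reference_wrapper<
           typename std::remove_reference<
               typename std::iterator_traits<It>::reference>::type>> {
    using Ref = std::reference_wrapper<
        typename std::remove_reference<
            typename std::iterator_traits<It>::reference>::type>;
    return std::vector<Ref>(first, last);  // range constructor, as used above
}
```

The result is only valid while the original elements stay alive and are not moved, for example by a reallocation of the source vector.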
If you really don't want to modify the original vector, then you'll have to sort a vector of pointers or indices into the original vector instead. To answer part of your question: no, there's no way to make a vector of references, and you shouldn't try to.
To find the top three (or n) elements, you don't even have to sort the whole vector. The STL's got you covered with std::nth_element (or std::partial_sort if you care about the order of the top elements). You would do something like this:
void modify_top_n(std::vector<Foo> &v, int n) {
std::vector<Foo*> tmp(v.size());
std::transform(v.begin(), v.end(), tmp.begin(), [](Foo &f) { return &f; });
std::nth_element(tmp.begin(), tmp.begin() + n, tmp.end(),
[](const Foo* f1, const Foo *f2) { return f1->score > f2->score; });
std::for_each(tmp.begin(), tmp.begin() + n, [](Foo *f) {
f->winner = true;
});
}
This assumes the vector has at least n entries. I used for_each just because it's easier when you have an iterator range; you can use a for loop as well (or for_each_n as Christophe mentioned, if you have C++17).
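If the "at least n entries" assumption cannot be guaranteed, a variant along these lines could clamp n first. This is a sketch only: the Foo here is a stand-in with just the two members the question shows, the name mark_top_n is made up, and std::partial_sort is swapped in for std::nth_element so that the top pointers also end up ordered.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Stand-in for the question's Foo; the real struct has other members too.
struct Foo {
    float score;
    bool winner;
};

// Sketch: mark up to n highest-scoring elements without reordering v.
void mark_top_n(std::vector<Foo>& v, std::size_t n) {
    n = std::min(n, v.size());  // guard: never step past the end
    std::vector<Foo*> tmp;
    tmp.reserve(v.size());
    for (Foo& f : v) {
        tmp.push_back(&f);
    }
    const auto mid = tmp.begin() + static_cast<std::ptrdiff_t>(n);
    // Order the first n pointers by descending score; the rest stay unsorted.
    std::partial_sort(tmp.begin(), mid, tmp.end(),
                      [](const Foo* a, const Foo* b) { return a->score > b->score; });
    std::for_each(tmp.begin(), mid, [](Foo* f) { f->winner = true; });
}
```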
Answering the question at face value:
Vectors of references (as well as built-in arrays of them) are not legal in C++. Here is normative standard wording for arrays:
There shall be no references to references, no arrays of references,
and no pointers to references.
And for vectors it is forbidden by the fact that vector elements must be assignable (while references are not).
To have an array or vector of indirect objects, one can either use a non-owning pointer (std::vector<int*>), or, if a non-pointer access syntax is desired, a wrapper - std::reference_wrapper.
So I guess I need to create a vector of reference then sort that? Is it legal to create a vector of reference?
No, it is not possible to have a vector of references. There is std::reference_wrapper for this purpose, or you can use a bare pointer.
Besides the two ways shown by Christophe, one more way is a transform iterator adaptor, which can be used to sort the top 3 pointers / reference wrappers into an array using std::partial_sort_copy.
A transform iterator simply adapts an output iterator by calling a function to transform input upon assignment. There are no iterator adaptors in the standard library though, so you need to implement one yourself, or use a library.

Emplace empty vector into std::map()

How can I emplace an empty vector into a std::map? For example, if I have a std::map<int, std::vector<int>>, and I want map[4] to contain an empty std::vector<int>, what can I call?
If you use operator[](const Key&), the map will automatically emplace a value-initialized (i.e. in the case of std::vector, default-constructed) value if you access an element that does not exist. See here:
http://en.cppreference.com/w/cpp/container/map/operator_at
(Since C++11 the details are a tad more complicated, but in your case this is what matters.)
That means if your map is empty and you do map[4], it will readily give you a reference to an empty (default-constructed) vector. Assigning an empty vector is unnecessary, although it may make your intent more clear.
Demo: https://godbolt.org/g/rnfW7g
Unfortunately the strictly-correct answer is indeed to use std::piecewise_construct as the first argument, followed by two tuples. The first represents the arguments to create the key (4), and the second represents the arguments to create the vector (empty argument set).
It would look like this:
map.emplace(std::piecewise_construct, // signal piecewise construction
std::make_tuple(4), // key constructed from int(4)
std::make_tuple()); // value is default constructed
Of course this looks unsightly, and other alternatives will work. They may even generate no more code in an optimised build:
This one notionally invokes default-construction and move-construction, but it is likely that the optimiser will see through it.
map.emplace(4, std::vector<int>());
This one invokes default-construction followed by an assignment from an empty braced list. But again, the optimiser may well see through it.
map[4] = {};
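A sketch comparing the three spellings on a fresh map. The alias IntVecMap and the helper names are assumptions made for this example; each helper should leave a single entry whose vector is empty:

```cpp
#include <map>
#include <tuple>
#include <utility>
#include <vector>

using IntVecMap = std::map<int, std::vector<int>>;  // alias used only in this sketch

// Key and value are both constructed in place from the two tuples.
IntVecMap with_piecewise() {
    IntVecMap m;
    m.emplace(std::piecewise_construct, std::make_tuple(4), std::make_tuple());
    return m;
}

// A temporary empty vector is created and then moved into the map.
IntVecMap with_temporary() {
    IntVecMap m;
    m.emplace(4, std::vector<int>());
    return m;
}

// operator[] default-constructs the vector, then {} is assigned to it.
IntVecMap with_subscript() {
    IntVecMap m;
    m[4] = {};
    return m;
}
```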
To ensure an empty vector is placed at position 4, you may simply attempt to clear the vector at position 4.
std::map<int, std::vector<int>> my_map;
my_map[4].clear();
As others have mentioned, the indexing operator for std::map will construct an empty value at the specified index if none already exists. If that is the case, calling clear is redundant. However, if a std::vector<int> does already exist, the call to clear serves to, well, clear the vector there, resulting in an empty vector.
This may be more efficient than my previous approach of assigning to {} (see below), because we probably plan on adding elements to the vector at position 4, and we don't pay any cost of new allocation this way. Additionally, if previous usage of my_map[4] indicates future usage, then our new vector will likely be eventually resized to nearly the same size as before, meaning we save on reallocation costs.
Previous approach:
just assign to {} and the container should properly construct an empty vector there:
std::map<int, std::vector<int>> my_map;
my_map[4] = {};
std::cout << my_map.size() << std::endl; // prints 1
Edit: As Jodocus mentions, if you know that the std::map doesn't already contain a vector at position 4, then simply attempting to access the vector at that position will default-construct one, e.g.:
std::map<int, std::vector<int>> my_map;
my_map[4]; // default-constructs a vector there
What's wrong with the simplest possible solution? map[4] = {};
In modern C++, this should do what you want with no or at least, very little, overhead.
If you must use emplace, the best solution I can come up with is this:
std::map<int, std::vector<int>> map;
map.emplace(4, std::vector<int>());
Use piecewise_construct with std::make_tuple:
map.emplace(std::piecewise_construct, std::make_tuple(4), std::make_tuple());
We are inserting an empty vector at position 4.
And for a more general case, such as emplacing a vector of 100 elements that are each set to 10:
map.emplace(std::piecewise_construct, std::make_tuple(4), std::make_tuple(100, 10));
piecewise_construct: This constant value is passed as the first argument when constructing a pair object, to select the constructor that builds its members in place by forwarding the elements of two tuple objects to the members' respective constructors.
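A small sketch of the forwarding described above, wrapped in a hypothetical function make_prefilled. The second tuple's elements go to the vector<int>(count, value) constructor, so no temporary vector is built:

```cpp
#include <map>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical helper: emplace a pre-filled vector at key 4.
std::map<int, std::vector<int>> make_prefilled() {
    std::map<int, std::vector<int>> m;
    m.emplace(std::piecewise_construct,
              std::make_tuple(4),         // key: int(4)
              std::make_tuple(100, 10));  // value: 100 elements, each equal to 10
    return m;
}
```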