Confusion with C++ STL container [] operator and default values - c++

Consider the following code:
unordered_map<string, vector<string>> hashtable;
string s = "foo";
hashtable[s].push_back("bar");
This seems to work, but that means that in the third line, it is both adding a new entry to the hashtable by initializing a vector of strings at key "foo" as well as adding "bar" to this empty vector. My confusion is how come we don't have to explicitly initialize a vector like:
unordered_map<string, vector<string>> hashtable;
string s = "foo";
vector<string> vec;
vec.push_back("bar");
hashtable[s] = vec;
Adding to my confusion is when we are dealing with stuff like initializing arrays in C++, it is good to explicitly initialize the array like so:
int array[10] = {0};
This is required if we want to make sure the array is initialized with all values being 0 since without it, there could be garbage values stored in memory at the same place the array was initialized at. Going back to my first question with the hashtable, how do we know
hashtable[s].push_back("bar");
isn't pushing "bar" into a vector with garbage values?
I realize my question is not very clear. Any clarification of the behaviour of the [] operator and the default values of STL containers in general would be appreciated.

My confusion is how come we don't have to explicitly initialize a vector
This is the expected behavior of std::unordered_map::operator[], which will perform an insertion with value-initialized mapped value if the key does not exist.
Returns a reference to the value that is mapped to a key equivalent to key, performing an insertion if such key does not already exist.
That means for hashtable[s].push_back("bar");, a value-initialized std::vector (i.e. an empty std::vector) will be inserted at first, then the vector will be returned by reference by std::unordered_map::operator[]. Then push_back("bar") is called on the vector (then its size becomes 1 and contains one element).
isn't pushing "bar" into a vector with garbage values?
No, std::vector is not the same as a raw array; its size is dynamic. As explained above, the value-initialized std::vector is empty: its size is 0, so it does not contain any elements yet (and therefore no "garbage values").
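The behaviour described above can be checked with a small sketch. The alias Table and the helper add_entry are hypothetical names used only for illustration; the sketch assumes a C++11 compiler.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

using Table = std::unordered_map<std::string, std::vector<std::string>>;

// Hypothetical helper: appends `value` under `key` and returns the new size
// of the vector stored at that key.
std::size_t add_entry(Table& table, const std::string& key, const std::string& value) {
    // operator[] inserts an empty (value-initialized) vector if `key` is absent,
    // then returns a reference to the vector stored in the map.
    std::vector<std::string>& bucket = table[key];
    bucket.push_back(value);
    return bucket.size();
}
```

Calling add_entry(table, "foo", "bar") on an empty table leaves exactly one element under "foo", which is what you would expect if the vector started out empty rather than holding leftover values.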

Related

Do vector indices not wrap around in C++? What is the workaround?

When I initialize a vector:
std::vector <int> someVec;
and I call:
std::cout << someVec[-1];
with an arbitrary number of elements, 0 is always returned.
Is there some way to get around this? The inability to do this in C++ is messing up my "sort" function. Is there any way to initialize said vector differently in order to return the last element in the vector, rather than 0 which seems to be the default. It seems that any index called outside the range of the vector will result in 0. It isn't wrapped which is baffling me.
The behaviour of accessing the vector outside the bounds (such as at the index [-1]) is undefined.
Is there anyway to [...] return the last element in the vector
Like this: someVec.back()
If you need a container that returns the last element when using the index [-1], then vector is not the container for your purpose. The standard library has no such container.
The inability to do this in C++ is messing up my "sort" function.
The ideal approach may be to fix the "sort" function to not require access to [-1]. The standard library comes with a sort function though, so you might not even need to implement one yourself.
Is there some way to get around this?
To access the last element in a std::vector use std::vector::back:
std::vector<int> vec(10,1);
std::cout << vec.back(); // prints 1
Note that after default initializing a vector
std::vector<int> vec2;
accessing any index via operator[] is undefined behaviour, because the size of the vector is 0, there aren't any elements to access.
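If wrap-around indexing is really wanted, one option is a small checked helper written on top of std::vector. The name at_wrapped is hypothetical and not part of the standard library; this is only a sketch of the idea, and it throws instead of running into undefined behaviour.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: Python-style indexing where -1 means the last element.
// Throws std::out_of_range rather than reading outside the vector.
int at_wrapped(const std::vector<int>& v, long index) {
    const long n = static_cast<long>(v.size());
    if (index < 0) {
        index += n;  // map -1 to n-1, -2 to n-2, and so on
    }
    if (index < 0 || index >= n) {
        throw std::out_of_range("at_wrapped: index out of range");
    }
    return v[static_cast<std::size_t>(index)];
}
```

With a vector holding {10, 20, 30}, at_wrapped(v, -1) yields 30, while any call on an empty vector throws.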

C++: how to initialize vector in map with non-zero size

I have a map of vectors:
std::map<int, std::vector<bool>> mymap;
At times, I need to insert a new element:
auto& newvec = mymap[42];
// Add stuff to newvec
As far as I understand (and assuming that 42 is not yet in the map), this will give me newvec with length 0 (constructed as std::vector<bool> {}) which I can then extend.
Is there a way to initialize the vector to some size n right away?
(I am not concerned about performance, just wondering if there is a way to do this).
Wrapping the std::vector<bool>
You could wrap the std::vector<bool> you want to initialise in the following way:
template<size_t N>
struct myvector {
myvector(): data(N) {}
std::vector<bool> data;
};
Then, declare mymap as a map whose value type is of this wrapper type, myvector<N>, instead of std::vector<bool>. For example, for N equal to 100:
std::map<int, myvector<100>> mymap;
If the key 42 does not exist in the map yet, then:
auto& newvec = mymap[42];
will create an instance of type myvector<100>, which in turn initialises a std::vector<bool> of size 100.
You could access the created std::vector<bool> object through myvector's data member, data (i.e., newvec.data). Performing reinterpret_cast<std::vector<bool>&>(newvec) may also work, but it relies on the layout of the wrapper and is best avoided.
Using std::map::find() and std::map::emplace()
Another approach would be to use std::map::find() instead of std::map::operator[]() to first find out whether a given key already exists in the map by comparing its returned iterator against the one returned by std::map::end(). If the given key does not exist, then construct the vector using std::map::emplace().
In your example, newvec could be initialized with this approach by means of the ternary operator:
auto it = mymap.find(42); // search for an element with the key 42
bool is_key_in_map = it != mymap.end();
// if the element with the given key exists, then return it, otherwise
// construct it
auto& newvec = is_key_in_map? it->second:
mymap.emplace(42, std::vector<bool>(100, true)).first->second;
Actually, you can directly call std::map::emplace() without checking whether the given key already exists, but that will cost the useless creation of a temporary object (i.e., the std::vector<bool> object) if the key is already present in the map:
auto& newvec = mymap.emplace(42, std::vector<bool>(100, true)).first->second;
Since C++17: std::map::try_emplace()
You could use std::map::try_emplace() instead of std::map::emplace():
auto& newvec = mymap.try_emplace(42, 100, true).first->second;
This way, the temporary object, std::vector<bool>(100, true), won't be constructed if the map already contains the given key (i.e., if it already contains the key 42). This is, therefore, more efficient than using std::map::emplace(), since no temporary object will be constructed if not necessary. However, it does require C++17.
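As a rough sketch of this approach, the call can be wrapped in a small function. The alias BoolMap and the name get_or_create are hypothetical; the code assumes a C++17 compiler because it relies on std::map::try_emplace.

```cpp
#include <cstddef>
#include <map>
#include <vector>

using BoolMap = std::map<int, std::vector<bool>>;

// Hypothetical helper: returns the vector stored at `key`, creating it with
// `n` elements set to `fill` only when the key is absent (requires C++17).
std::vector<bool>& get_or_create(BoolMap& m, int key, std::size_t n, bool fill) {
    // try_emplace forwards (n, fill) to the std::vector<bool> constructor
    // and leaves the map untouched if the key already exists.
    return m.try_emplace(key, n, fill).first->second;
}
```

A second call with the same key returns the existing vector as it is, so any changes made to it in the meantime are preserved.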
Use map::try_emplace() (or map::emplace() before C++17)
std::vector has a constructor which takes an initial size and an initial uniform value. In your case, suppose you want 1234 as the initial size. With a stand-alone vector, you would use:
size_t num_bools_we_want = 1234;
std::vector<bool> my_vec(num_bools_we_want, false);
Now, std::map has a method named map::try_emplace() which forwards arguments to a constructor of the value type, which effectively allows you to choose the constructor it will use for a new element. Here's how to use it
mymap.try_emplace(42, num_bools_we_want, false);
to create a value of std::vector<bool>(num_bools_we_want, false) for the key 42. No temporary vectors are created (regardless of compiler optimizations).
The only "problem" with this solution is that try_emplace() only exists since C++17. Since you asked about C++11 - that version of the standard introduced map::emplace(), which does almost the same thing except for an issue with making a copy of the key. See this question for a discussion of the difference between emplace() and try_emplace().
You can use map::emplace member function:
mymap.emplace(42, std::vector<bool>(125, false));
to create a value of std::vector<bool>(125, false) for the key 42.
As ネロク mentions, the above emplace call will construct the value std::vector<bool>(125, false) even if the key 42 already exists in the map (this is also documented in the cppreference page I linked above). If this is to be avoided, you can first check if the value already exists using map::find and insert the value only if the key doesn't exist. That is:
if (mymap.find(42) == mymap.end()) {
mymap.emplace(42, std::vector<bool>(125, false));
}
Both map::find and map::emplace have logarithmic time complexity; hence, calling find before emplace should not hurt performance much, even in performance-critical scenarios.
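The find-then-emplace pattern can be packaged as follows. This is a sketch that should work from C++11 onwards; BoolMap and ensure_sized are hypothetical names chosen for the example.

```cpp
#include <cstddef>
#include <map>
#include <vector>

using BoolMap = std::map<int, std::vector<bool>>;

// Hypothetical helper: inserts a vector of `n` false values at `key` if the
// key is absent. Returns true when a new element was actually inserted.
bool ensure_sized(BoolMap& m, int key, std::size_t n) {
    if (m.find(key) != m.end()) {
        return false;  // key already present: no temporary vector is built
    }
    // emplace returns a pair<iterator, bool>; .second reports the insertion.
    return m.emplace(key, std::vector<bool>(n, false)).second;
}
```

The boolean result makes it easy for the caller to tell a fresh vector from one that already existed and may have a different size.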

Emplace empty vector into std::map()

How can I emplace an empty vector into a std::map? For example, if I have a std::map<int, std::vector<int>>, and I want map[4] to contain an empty std::vector<int>, what can I call?
If you use operator[](const Key&), the map will automatically emplace a value-initialized (i.e. in the case of std::vector, default-constructed) value if you access an element that does not exist. See here:
http://en.cppreference.com/w/cpp/container/map/operator_at
(Since C++ 11 the details are a tad more complicated, but in your case this is what matters).
That means if your map is empty and you do map[4], it will readily give you a reference to an empty (default-constructed) vector. Assigning an empty vector is unnecessary, although it may make your intent more clear.
Demo: https://godbolt.org/g/rnfW7g
Unfortunately the strictly-correct answer is indeed to use std::piecewise_construct as the first argument, followed by two tuples. The first represents the arguments to create the key (4), and the second represents the arguments to create the vector (empty argument set).
It would look like this:
map.emplace(std::piecewise_construct, // signal piecewise construction
std::make_tuple(4), // key constructed from int(4)
std::make_tuple()); // value is default constructed
Of course this looks unsightly, and other alternatives will work. They may even generate no more code in an optimised build:
This one notionally invokes default-construction and move-construction, but it is likely that the optimiser will see through it.
map.emplace(4, std::vector<int>());
This one invokes default-construction followed by copy-assignment. But again, the optimiser may well see through it.
map[4] = {};
To ensure an empty vector is placed at position 4, you may simply attempt to clear the vector at position 4.
std::map<int, std::vector<int>> my_map;
my_map[4].clear();
As others have mentioned, the indexing operator for std::map will construct an empty value at the specified index if none already exists. If that is the case, calling clear is redundant. However, if a std::vector<int> does already exist, the call to clear serves to, well, clear the vector there, resulting in an empty vector.
This may be more efficient than my previous approach of assigning to {} (see below), because we probably plan on adding elements to the vector at position 4, and we don't pay any cost of new allocation this way. Additionally, if previous usage of my_map[4] indicates future usage, then our new vector will likely eventually be resized to nearly the same size as before, meaning we save on reallocation costs.
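The point about keeping the allocation can be illustrated with a short sketch. IntVecMap and reset_slot are hypothetical names; the sketch assumes, as mainstream standard library implementations do, that clear() leaves the vector's capacity unchanged.

```cpp
#include <cstddef>
#include <map>
#include <vector>

using IntVecMap = std::map<int, std::vector<int>>;

// Hypothetical helper: empties the vector at `key` (creating an empty one if
// the key is absent) and returns the capacity the vector has afterwards.
std::size_t reset_slot(IntVecMap& m, int key) {
    std::vector<int>& v = m[key];  // inserts an empty vector when key is new
    v.clear();                     // size becomes 0; storage is not released
    return v.capacity();
}
```

If the vector previously held many elements, the returned capacity stays at its earlier value, so refilling the vector to a similar size does not need a new allocation.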
Previous approach:
just assign to {} and the container should properly construct an empty vector there:
std::map<int, std::vector<int>> my_map;
my_map[4] = {};
std::cout << my_map.size() << std::endl; // prints 1
Edit: As Jodocus mentions, if you know that the std::map doesn't already contain a vector at position 4, then simply attempting to access the vector at that position will default-construct one, e.g.:
std::map<int, std::vector<int>> my_map;
my_map[4]; // default-constructs a vector there
What's wrong with the simplest possible solution? map[4] = {};
In modern C++, this should do what you want with no or at least, very little, overhead.
If you must use emplace, the best solution I can come up with is this:
std::map<int, std::vector<int>> map;
map.emplace(4, std::vector<int>());
Use piecewise_construct with std::make_tuple:
map.emplace(std::piecewise_construct, std::make_tuple(4), std::make_tuple());
We are inserting an empty vector at position 4.
For a more general case, such as emplacing a vector of 100 elements that are all equal to 10:
map.emplace(std::piecewise_construct, std::make_tuple(4), std::make_tuple(100, 10));
piecewise_construct: This constant value is passed as the first argument to construct a pair object to select the constructor form that constructs its members in place by forwarding the elements of two tuple objects to their respective constructor.
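A sketch of piecewise construction wrapped in a function is shown below. IntVecMap and emplace_filled are hypothetical names; std::forward_as_tuple is used here in place of std::make_tuple so that the arguments are forwarded by reference, which is an assumption of this example rather than something the answers above require.

```cpp
#include <cstddef>
#include <map>
#include <tuple>
#include <utility>
#include <vector>

using IntVecMap = std::map<int, std::vector<int>>;

// Hypothetical helper: emplaces a vector of `count` copies of `value` at `key`.
// Returns true if the element was inserted, false if the key already existed.
bool emplace_filled(IntVecMap& m, int key, std::size_t count, int value) {
    return m.emplace(std::piecewise_construct,
                     std::forward_as_tuple(key),            // arguments for the key
                     std::forward_as_tuple(count, value))   // arguments for the vector
        .second;
}
```

Passing an empty second tuple instead would default-construct the vector, which is the empty-vector case from the question.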

Pointer to Array holding two Vectors

I have run into something I can't seem to figure out myself again. I don't know what to call this problem.
vector<int> *integerVectors[2] = { new vector<int>, new vector<int>};
(*integerVectors)[0].push_back(1);
(*integerVectors)[1].push_back(1);
When I run this I get an unhandled exception. What I want is an array with 2 indexes, each of them holding a vector.
EDIT: The Problem seems to appear when i start pushing back.
this syntax:
MyType *var[size];
creates an array of pointers to MyType of size size, which means that this:
vector<int> *integerVectors[2];
produces a size 2 array of pointers to integer vectors, a fact which is backed up by your ability to initialize integerVectors with an initializer-list of pointers to integer vectors produced by calls to new.
this:
(*integerVectors)
produces your first vector pointer, the same as integerVectors[0]. You then call operator[] on it, which offsets that pointer by the size of a vector. But this pointer refers to a single vector, not to your array: if you call it with an argument greater than 0, you'll be referencing an imaginary vector next to the one pointed to by your first array element.
Then you call push_back on the imaginary vector, naturally leading to massive problems at run time.
You either want to offset before dereferencing, as in @Abstraction's suggestion of
integerVectors[i]->push_back(1);
or you want to avoid C-style arrays. You're already using one vector, and nesting them rather than making arrays of them will avoid much confusion of this type in the future, while preserving the correct syntax:
vector<vector<int>*> integerVectors = {new vector<int>, new vector<int>};
integerVectors[1]->push_back(1);
Without the c-style array, your needed syntax is vastly clearer.
Even better, you could just avoid the pointers altogether and use
vector<vector<int>> integerVectors = {vector<int>{}, vector<int>{}};
integerVectors[1].push_back(1);
The first line of which has a few equivalent syntaxes, as pointed out by @NeilKirk:
vector<vector<int>> integerVectors(2);
vector<vector<int>> integerVectors{{},{}};
vector<vector<int>> integerVectors(2, vector<int>{});
vector<vector<int>> integerVectors(2, {});
The curly braces I'm using in initialization are another way to initialize an object, without the possibility of the compiler confusing it for a function in the most vexing parse. (However, you have to be careful with it around things which can be initialized with initializer lists like here, which is why some of these initializations use parentheses, instead) The first syntax default-initializes two vectors, and so does the second, though it explicitly states the same thing. It can also be modified to produce more complex structures: two vectors both with the elements {1,2,3} could be one of the below:
vector<vector<int>> integerVectors = {{1,2,3},{1,2,3}};
vector<vector<int>> integerVectors{{1,2,3},{1,2,3}};
vector<vector<int>> integerVectors(2, vector<int>{1,2,3});
vector<vector<int>> integerVectors(2, {1,2,3});
This line is the problem
(*integerVectors)[1].push_back(1);
Dereferencing integerVectors (*integerVectors) gives you the first vector pointer (the equivalent of integerVectors[0]). You then call operator[] with 1 as the argument on this pointer, which gives a reference to a vector whose address is shifted forward by one vector size (equivalent to *(integerVectors[0]+1)), which is not valid.
The right syntax is
integerVectors[1]->push_back(1);
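If the goal is simply a fixed-size array of two vectors, one pointer-free alternative is std::array. The alias VecPair and the helper push_to are hypothetical names for this sketch, which assumes C++11.

```cpp
#include <array>
#include <cstddef>
#include <vector>

using VecPair = std::array<std::vector<int>, 2>;

// Hypothetical helper: appends `value` to the vector stored in slot `slot`.
// std::array::at throws std::out_of_range if `slot` is not 0 or 1.
void push_to(VecPair& vecs, std::size_t slot, int value) {
    vecs.at(slot).push_back(value);
}
```

Because the array holds the vectors directly, there is nothing to dereference and nothing to delete, and a bad slot index is reported instead of silently touching unrelated memory.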

Program crashes if I don't define a vector size

Currently learning C++ and stumbled upon this problem
std::vector<int> example;
example[0] = 27;
std::cout << example[0];
This crashes the program, but if I define a size std::vector<int> example(1) it works fine. What I also noticed was that if I use example.push_back(27) instead of example[0] = 27 without defining the size it also works fine. Is there a reasoning behind this?
An empty vector has no elements allocated in memory.
You should use example.push_back(27) instead of trying to subscript it. push_back() allocates a new element and then adds it to the vector. Once that new element is added to the vector, it can be reassigned using example[0] = something.
The reason is very simple: an empty vector has no elements, so you may not use the subscript operator. No memory was allocated and no element was created that you could access with the subscript operator.
As for the method push_back then it adds an element to the vector.
Also using the constructor the way you wrote
std::vector<int> example(1)
creates a vector with one element.
To be more specific, you're using the default constructor so the vector is not obligated to allocate any memory (because you did not ask it to).
So if you run the following code:
std::vector<int> e;
cout<< e.size() << endl;
cout<< e.capacity() << endl;
It'll print 0 and 0. So the size is 0
And as the documentation states:
If the container size is greater than n, the function never throws
exceptions (no-throw guarantee). Otherwise, the behavior is undefined.
e.push_back(5);
std::cout << e[0];
Would work.
From: http://www.cplusplus.com/reference/vector/vector/operator[]/
std::vector::operator[]
Access element
Returns a reference to the element at position n in the vector container.
A similar member function, vector::at, has the same behavior as this operator function, except that vector::at is bound-checked and signals if the requested position is out of range by throwing an out_of_range exception.
Portable programs should never call this function with an argument n that is out of range, since this causes undefined behavior.
This can get confusing in STL since std::map, for example:
std::map<int, std::string> myMap;
myMap[0] = "hello";
will create the mapping if one does not already exist.
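The contrast between the two operator[] behaviours can be sketched like this. Both helper names, has_index and size_after_lookup, are hypothetical and exist only for the example.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: reports whether `index` is a valid position in `v`,
// using the bounds-checked at() instead of operator[].
bool has_index(const std::vector<int>& v, std::size_t index) {
    try {
        (void)v.at(index);  // throws std::out_of_range when index >= size()
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}

// Hypothetical helper: looks up `key` with operator[] and returns the map's
// size afterwards, showing that a missing key gets inserted by the lookup.
std::size_t size_after_lookup(std::map<int, std::string>& m, int key) {
    (void)m[key];  // inserts an empty string if `key` is absent
    return m.size();
}
```

On an empty vector has_index(v, 0) is false, whereas on an empty map size_after_lookup(m, 0) returns 1 because the lookup itself created the entry.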