What's the difference between 2d vector and map of vector? - c++

Let's say I've declared
map< int , vector<int> > g1;
vector< vector<int> > g2;
What are the similarities and dissimilarities between these two ?

The similarity is the way you access data, it can be the same syntax:
std::cout << g1[3][2] << std::endl;
std::cout << g2[3][2] << std::endl;
The main difference is the following: the map of vector doesn't have to contain all the indices. Then, you can have, as example, only 3 vectors in your map accessed with keys '17', '1234' and 13579 :
g2[17].resize(10);
g2[1234].resize(5);
g2[13579].resize(100);
If you want the same syntax with a vector of vectors, you need to have at least 13579 vectors (including 13576 empty vector) in your main vector. But this will use a lot of unused space in the memory.
Moreover, in your map, you also can access your vectors with negative keys (which is not possible in the vector of vectors):
g2[-10].resize(10);
After this obviously high difference, the storage of data is different. The vector allocates contiguous memory, while the map is stored as tree. The complexity of access in the vector is O(1), while it's O(log(n)) in the map. I invite you to learn some tutorial about containers in C++ to understand all the differences and the usual way to use them.

They are fundamentally different. While you may be able to do both g2[0] and g1[0], the behavior is vastly different. Assume there is nothing at index 0, then std::map will default construct a new value_type, in this case a vector, and return a reference, whereas std::vector has undefined behavior, but typically either segfaults or returns garbage.
They are also completely different in terms of memory layout. While std::map is backed by a red-black tree, std::vector is contiguous in memory. So inserting into the map will always result in dynamic allocation somewhere in memory, whereas the vector would be resized in case its current capacity is exceeded. Note however, that a vector of vectors is not contiguous in memory. The first vector, which itself is contiguous in memory is made up of vectors which look roughly like this in terms of data:
struct vector
{
T* data;
size_t capacity;
size_t size;
};
Where each of the vectors owns its dynamic memory allocation at data.
The advantage of the map is that is does not have to be densely populated, i.e. you can have something at index 0 and 12902 without all the stuff in between, plus it is sorted. If you don't need the sorted property and can use c++11 consider std::unordered_map. The vector is always densely populated, i.e. at size 10000, elements 0-9999 exist.

With example you can understand the difference. Lets say the vector<int> stores unique ids of people, and map stores respective pincode as key.
map< int , vector<int> > listOfPeopleAtRespectivePinCode;
vector< vector<int> > bunchOfGroupsOfPeople;
Evidently, map are capable to associate key and value (here a list of values), while vector can efficiently store a bunch of data.

Related

How to reserve a multi-dimensional Vector without increasing the vector size?

I have data which is N by 4 which I push back data as follows.
vector<vector<int>> a;
for(some loop){
...
a.push_back(vector<int>(4){val1,val2,val3,val4});
}
N would be less than 13000. In order to prevent unnecessary reallocation, I would like to reserve 13000 by 4 spaces in advance.
After reading multiple related posts on this topic (eg How to reserve a multi-dimensional Vector?), I know the following will do the work. But I would like to do it with reserve() or any similar function if there are any, to be able to use push_back().
vector<vector<int>> a(13000,vector<int>(4);
or
vector<vector<int>> a;
a.resize(13000,vector<int>(4));
How can I just reserve memory without increasing the vector size?
If your data is guaranteed to be N x 4, you do not want to use a std::vector<std::vector<int>>, but rather something like std::vector<std::array<int, 4>>.
Why?
It's the more semantically-accurate type - std::array is designed for fixed-width contiguous sequences of data. (It also opens up the potential for more performance optimizations by the compiler, although that depends on exactly what it is that you're writing.)
Your data will be laid out contiguously in memory, rather than every one of the different vectors allocating potentially disparate heap locations.
Having said that - #pasbi's answer is correct: You can use std::vector::reserve() to allocate space for your outer vector before inserting any actual elements (both for vectors-of-vectors and for vectors-of-arrays). Also, later on, you can use the std::vector::shrink_to_fit() method if you ended up inserting a lot less than you had planned.
Finally, one other option is to use a gsl::multispan and pre-allocate memory for it (GSL is the C++ Core Guidelines Support Library).
You've already answered your own question.
There is a function vector::reserve which does exactly what you want.
vector<vector<int>> a;
a.reserve(N);
for(some loop){
...
a.push_back(vector<int>(4){val1,val2,val3,val4});
}
This will reserve memory to fit N times vector<int>. Note that the actual size of the inner vector<int> is irrelevant at this point since the data of a vector is allocated somewhere else, only a pointer and some bookkeeping is stored in the actual std::vector-class.
Note: this answer is only here for completeness in case you ever come to have a similar problem with an unknown size; keeping a std::vector<std::array<int, 4>> in your case will do perfectly fine.
To pick up on einpoklum's answer, and in case you didn't find this earlier, it is almost always a bad idea to have nested std::vectors, because of the memory layout he spoke of. Each inner vector will allocate its own chunk of data, which won't (necessarily) be contiguous with the others, which will produce cache misses.
Preferably, either:
Like already said, use an std::array if you have a fixed and known amount of elements per vector;
Or flatten your data structure by having a single std::vector<T> of size N x M.
// Assuming N = 13000, M = 4
std::vector<int> vec;
vec.reserve(13000 * 4);
Then you can access it like so:
// Before:
int& element = vec[nIndex][mIndex];
// After:
int& element = vec[mIndex * 13000 + nIndex]; // Still assuming N = 13000

Filling a vector with out-of-order data in C++

I'd like to fill a vector with a (known at runtime) quantity of data, but the elements arrive in (index, value) pairs rather than in the original order. These indices are guaranteed to be unique (each index from 0 to n-1 appears exactly once) so I'd like to store them as follows:
vector<Foo> myVector;
myVector.reserve(n); //total size of data is known
myVector[i_0] = v_0; //data v_0 goes at index i_0 (not necessarily 0)
...
myVector[i_n_minus_1] = v_n_minus_1;
This seems to work fine for the most part; at the end of the code, all n elements are in their proper places in the vector. However, some of the vector functions don't quite work as intended:
...
cout << myVector.size(); //prints 0, not n!
It's important to me that functions like size() still work--I may want to check for example, if all the elements were actually inserted successfully by checking if size() == n. Am I initializing the vector wrong, and if so, how should I approach this otherwise?
myVector.reserve(n) just tells the vector to allocate enough storage for n elements, so that when you push_back new elements into the vector, the vector won't have to continually reallocate more storage -- it may have to do this more than once, because it doesn't know in advance how many elements you will insert. In other words you're helping out the vector implementation by telling it something it wouldn't otherwise know, and allowing it to be more efficient.
But reserve doesn't actually make the vector be n long. The vector is empty, and in fact statements like myVector[0] = something are illegal, because the vector is of size 0: on my implementation I get an assertion failure, "vector subscript out of range". This is on Visual C++ 2012, but I think that gcc is similar.
To create a vector of the required length simply do
vector<Foo> myVector(n);
and forget about the reserve.
(As noted in the comment you an also call resize to set the vector size, but in your case it's simpler to pass the size as the constructor parameter.)
You need to call myVector.resize(n) to set (change) the size of the vector. calling reserve doesn't actually resize the vector, it just makes it so you can later resize without reallocating memory. Writing past the end of the vector (as you are doing here -- the vector size is still 0 when you write to it) is undefined behavior.

dynamically changing name of an array in c++

Hi I have a problem where at compile time, I don't know how many vectors in my program are needed. The number required depends on a data set given at run-time, which will results in the range of vectors needed to be from 1 to N.
So if the data set requires ten vectors, it will create vec1,vec2,......vecN
How can i dynamically create the vectors so that they all have a different name?
I would then need to call each array separately. Presumably
I could just use strings and a few loops for this.
You can't do that directly. However, you could use a map to store a vector name, and the vector itself:
map<string, vector<int> > myMap;
You can add elements simply like this (if the element with such key doesn't exist yet):
vector<int> vec;
myMap["vec"] = vec;
If you'll do it with a key that already exists, the value will be replaced. For example:
vector<int> vec;
vector<int> vec1;
myMap["vec"] = vec;
myMap["vec"] = vec1;//now myMap["vec"] holds the vec1 vector
You can also easlly access elements like this:
myMap["vec"]//this will access the vector with the key "vec1"
You can create a vector to contain your vectors:
std::vector<std::vector<int>> my_vector_of_vectors;
// Add a vector
my_vector_of_vectors.push_back(std::vector<int>{});
// Add a number to the inner vector
my_vector_of_vectors[0].push_back(1);
you have a vector of vectors, vec[0] to vec[n], each one containing the vector.
However, this works est if you know the number of vectors (eg 10). If you need to add new ones on-demand, then a list of vectors or a map might be a better option for you.

Map inserstion failure: can not overwrite first element?

static std::map <unsigned int, CPCSteps> bestKvariables;
inline void copyKBestVar(MaxMinVarMap& vMaxMinAll, size_t& K, std::vector<unsigned int>& temp)
{
// copy top k variables and store in a vector.
MaxMinVarMap::reverse_iterator iter1;
size_t count;
for (iter1 = vMaxMinAll.rbegin(), count = 0; iter1 != vMaxMinAll.rend()&& count <= K; ++iter1, ++count)
{
temp.push_back(iter1->second);
}
}
void myAlgo::phase1(unsigned int& vTarget)
{
CPCSteps KBestForT; // To store kbest variables for only target variable finally put in a global storage
KBestForT.reserve(numVars);
std::vector<unsigned int> tempKbest;
tempKbest.reserve(numVars);
.......
.......
copyKBestVar(mapMinAssoc, KBestSize, tempKbest); // Store k top variables as a k best for this CPC variable for each step
KBestForT.push_back(tempKbest);
.....
.....
bestKvariables.insert(make_pair(vTarget, KBestForT)); // Store k best in a Map
.....
....
}
The Problem: Map "bestKvariables" is not overwriting the first element, but it keeps updating the rest of the elements. I tried to debug it but the problem i found is in insert command.
Thanks in advance for helping.
Another Question: whether i can reserve the size of a map(like vector.reserve(..)) in the beginning to avoid the insertion cost.
Sorry for providing insufficient information.
I means if there are four vTarget variables 1, 2, 3, 4. I do some statistical calculations for each variable.
There are more than one iterations for these variables, i would like to store top k results of each variable in a map to use it next iteration.
I saw the first inserted variable(with key unsigned int "vTarget") is not updated on further iterations(it remains the value inserted on first iteration).
But the other variables (keys inserted after the first) are remains updated.
Another Question: whether i can reserve the size of a map(like vector.reserve(..)) in the beginning to avoid the insertion cost.
std::map does not have a reserve() function unlike std::vector.
Usually, the Standard library provides functions for containers that provide and ensure good performance or provide a means to achieve the same.
For a container like std::vector reallocation of its storage can be a very costly
operation. A simple call to push_back() can lead to every element in the std::vector to be copied to a newly allocated block of memory. A call to reserve() can avoid these unnecessary allocations and copy operations for std::vector and hence the same is provided for it.
std::map never needs to copy all of the existing/remaining elements simply because a new element was inserted or removed. Hence it does not provide any such function.
Though the Standard does not specify how a std::map should be implemented the behavior expected and the complexity desired ensures that most implementations implement it as a tree, unlike std::vector which needs elements to be allocated in contiguous memory locations.
map::insert isn't supposed to update/overwrite anything, just insert not-yet-present elements. Use operator[] for updating, it also inserts elements when the specified key is not present yet.

inserting into the middle of an array

I have an array int *playerNum which stores the list of all the numbers of the players in the team. Each slot e.g playerNum[1]; represents a position on the team, if I wanted to add a new player for a new position on the team. That is, inserting a new element into the array somewhere near the middle, how would I go about doing this?
At the moment, I was thinking you memcpy up to the position you want to insert the player into a new array and then insert the new player and copy over the rest of it?
(I have to use an array)
If you're using C++, I would suggest not using memcpy or memmove but instead using the copy or copy_backward algorithms. These will work on any data type, not just plain old integers, and most implementations are optimized enough that they will compile down to memmove anyway. More importantly, they will work even if you change the underlying type of the elements in the array to something that needs a custom copy constructor or assignment operator.
If you have to use an array, after having made sure you have enough storage (using realloc if necessary), use memmove to shift the items from the insertion point to the end by one position, then save your new player at the desired location.
You can't use memcpy if the source and target areas overlap.
This will fail as soon as the objects in your array have non-trivial copy-constructors, and it's not idiomatic C++. Using one of the container classes is much safer (std::vector or std::list for instance).
Your solution using memcpy is correct (under few assumptions mentionned by other).
However, and since you are programming in C++. It is probably a better choice to use std::vector and its insert method.
vector<int> myvector (3,100);
myvector.insert ( 10 , 42 );
An array takes a contiguous block of memory, there is no function for you to insert an element in the middle. you can create a new one of size larger than the origin's by one then copy the original array into the new one plus the new member
for(int i=0;i<arSize/2;i++)
{
newarray[i]<-ar[i];
}
newarray[i+1]<-newelemant;
for(int j=i+1<newSize;j++,i++)
{
newarray[i]<-ar[i];
}
if you use STL, ting becomes easier, use list.
As you're talking about an array and "insert" I assume that it is a sorted array. You don't necessarily need a second array provided that the capacity N of your existing array is large enough to store more entries (N>n, where n is the number of current entries). You can move the entries from k to n-1 (zero-indexed) to k+1 to n, where k is the desired insert position. Insert the new element at index position k and increase n by one. If the array is not large enough in the beginning, you can follow your proposed approach or just reallocate a new array of larger capacity N' and copy the existing data before applying the actual insert operation described above.
BTW: As you're using C++, you could easily use std::vector.
While it is possible to use arrays for this, C++ has a better solutions to offer. For starters, try std::vector, which is a decent enough general-purpose container, based on a dynamically-allocated array. It behaves exactly like an array in many cases.
Looking at your problem, however, there are two downsides to arrays or vectors:
Indices have to be 0-based and contiguous; you cannot remove elements from the middle without losing key/value associations for everything after the removed element; so if you remove the player on position 4, then the player from position 9 will move to position 8
Random insertion and deletion (that is, anywhere except the end) is expensive - O(n), that is, execution time grows linearly with array size. This is because every time you insert or delete, a part of the array needs to be moved.
If the key/value thing isn't important to you, and insertion/deletion isn't time critical, and your container is never going to be really large, then by all means, use a vector. If you need random insertion/deletion performance, but the key/value thing isn't important, look at std::list (although you won't get random access then, that is, the [] operator isn't defined, as implementing it would be very inefficient for linked lists; linked lists are also very memory hungry, with an overhead of two pointers per element). If you want to maintain key/value associations, std::map is your friend.
Losting the tail:
#include <stdio.h>
#define s 10
int L[s];
void insert(int v, int p, int *a)
{
memmove(a+p+1,a+p,(s-p+1)*4);
*(a+p) = v;
}
int main()
{
for(int i=0;i<s;i++) L[i] = i;
insert(11,6, L);
for(int i=0;i<s;i++) printf("%d %d\n", L[i], &L[i]);
return 0;
}