std multiset insert and keep length fixed - c++

I am interested in inserting elements in a std::multiset but I would like to keep the set fixed length. Every time an element is inserted, the last element will be removed. I came up with the following solution
#include <cstdlib>
#include <iostream>
#include <set>
#include <utility>

int main()
{
    std::multiset<std::pair<double, int>> ms;
    for (int i = 0; i < 10; i++) {
        ms.insert(std::pair<double, int>(double(rand()) / RAND_MAX, i));
    }
    ms.insert(std::pair<double, int>(0.5, 10));
    ms.erase(--ms.end());   // drop the largest element to keep the size fixed
    for (auto el : ms) {
        std::cout << el.first << "\t" << el.second << std::endl;
    }
    return 0;
}
I will be doing something similar to this many times in my code, on sets with a size on the order of 1000 elements. Is there a more performant way of doing this? I am worried that the erase will cause memory reallocation and slow down the code.
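One thing to keep in mind: std::multiset is node-based, so erasing one element only frees that node; the other elements are never reallocated or moved. As a rough sketch (my own helper, not part of the code above), you can also wrap the insert-then-trim pattern so the insert is skipped entirely when the candidate would not survive the trim, which avoids allocating a node just to free it again:

#include <cstddef>
#include <iterator>
#include <set>
#include <utility>

using Entry = std::pair<double, int>;

// Sketch of a helper (not from the question): insert e but keep at most
// max_size elements, dropping the largest when over capacity.
void insert_bounded(std::multiset<Entry>& ms, const Entry& e, std::size_t max_size)
{
    if (max_size == 0) return;
    if (ms.size() < max_size) {
        ms.insert(e);
        return;
    }
    auto last = std::prev(ms.end());   // current largest element
    if (e < *last) {                   // only replace it if e would actually stay
        ms.erase(last);
        ms.insert(e);
    }
}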

Related

C++ Removing empty elements from array

I only want to add a[i] into the result array if the condition is met, but this method causes empty elements in the array as it adds to result[i]. Is there a better way to do this?
for (int i = 0; i < N; i++)
{
    if (a[i] >= lower && a[i] <= upper)
    {
        count++;
        result[i] = a[i];
    }
}
You can let result stay empty at first, and only push_back a[i] when the condition is met:
std::vector<...> result;
for (int i = 0; i < N; i++)
{
    if (a[i] >= lower && a[i] <= upper)
    {
        result.push_back(a[i]);
    }
}
And count you can leave out entirely, as result.size() will tell you how many elements satisfied the condition.
For a more modern solution, as Some programmer dude suggested, you can use std::copy_if in combination with std::back_inserter to achieve the same thing:
std::vector<...> result;
std::copy_if(a.begin(), a.end(), std::back_inserter(result),
             [&](auto n) {
                 return n >= lower && n <= upper;
             });
Raw arrays in C++ are dumb.
They decay to a pointer to the beginning of the array and don't know their own length.
If you just write arr[i] you have to make sure yourself that you aren't out of bounds. Going out of bounds is undefined behavior, since you don't know which part of memory you have written over. You could just as well overwrite a different variable or the beginning of another array.
So when you want to add results to an array, you already have to have created it with enough space.
This boilerplate of deleting and re-creating dumb arrays so that you can grow them is done very efficiently by the std::vector container, which keeps track of the number of elements stored, the number of elements that can be stored without reallocating, and the array itself. Every time you try to add an element while the reserved space is full, it allocates a new, larger array (typically by a constant growth factor such as 1.5x or 2x) and copies the data over. That single reallocation is O(n), but it happens rarely enough that push_back is amortized O(1).
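If you want to see this growth for yourself, here is a tiny illustration (just a sketch; the exact growth factor and the sizes at which reallocation happens are implementation-defined):

#include <cstddef>
#include <iostream>
#include <vector>

int main()
{
    std::vector<int> v;
    std::size_t last_capacity = 0;
    for (int i = 0; i < 1000; ++i)
    {
        v.push_back(i);
        if (v.capacity() != last_capacity)   // capacity changed => a reallocation happened
        {
            last_capacity = v.capacity();
            std::cout << "size " << v.size() << " -> capacity " << last_capacity << "\n";
        }
    }
}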
Then the answer from Stack Danny applies.
Also, use emplace_back instead of push_back when you can: it constructs the element in place from the constructor arguments you pass, and otherwise it behaves like push_back. It basically does what you want in the fastest way possible, so you avoid as many copies as possible.
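For example (with a made-up Item type, since the snippets above only deal with plain ints, where the two calls end up doing the same thing):

#include <string>
#include <utility>
#include <vector>

struct Item
{
    std::string name;
    int value;
    Item(std::string n, int v) : name(std::move(n)), value(v) {}
};

int main()
{
    std::vector<Item> items;
    items.push_back(Item("a", 1));   // builds a temporary Item, then moves it into the vector
    items.emplace_back("b", 2);      // builds the Item directly inside the vector, no temporary
}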
count = 0;
for (int i = 0; i < N; i++)
{
    if (a[i] >= lower && a[i] <= upper)
    {
        result[count] = a[i];
        count++;
    }
}
Try this.
Your code was copying a[i] into result[i], so the matches ended up scattered across the result array.
For example, if a[0] and a[2] meet the required condition, but a[1] doesn't, then your code will do the following:
result[0] = a[0];
result[2] = a[2];
Notice how result[1] remains empty because a[1] didn't meet the required condition. To avoid empty positions in the result array, use another variable for copying instead of i.

What is the time complexity of this particular code?

I have created a simple program which keeps a count of the elements in an array using an unordered map. I wanted to know the time complexity of the program below.
Is it simply O(n) time?
How much time do the operations on the unordered map require?
(i.e. looking up a key in the map, incrementing its value by 1 if it is present, and initializing it to 1 if it is not)
Is this done in constant time, or in logarithmic or linear time?
If it is not constant time, then please suggest a better approach.
#include <unordered_map>
#include <iostream>

int main()
{
    int n;
    std::cin >> n;
    int arr[100];                       // note: this assumes n <= 100
    for (int i = 0; i < n; i++)
        std::cin >> arr[i];
    std::unordered_map<int, int> dp;
    for (int i = 0; i < n; i++)
    {
        if (dp.find(arr[i]) != dp.end())
            dp[arr[i]]++;
        else
            dp[arr[i]] = 1;
    }
}
The documentation says that std::unordered_map::find() has a complexity of
Constant on average, worst case linear in the size of the container.
So you get an average complexity of O(n) for the whole loop and a worst-case complexity of O(n^2).
Addendum:
Since you use ints as keys and no custom hash function, I think it is safe to assume O(1) for find, since you probably won't hit the worst case.
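As a side note (not something the question asked, just a common simplification): std::unordered_map::operator[] value-initializes a missing key's mapped value to 0, so the find/insert branch in your loop can be collapsed into a single statement with the same average O(1) cost per element:

std::unordered_map<int, int> dp;
for (int i = 0; i < n; i++)
    dp[arr[i]]++;   // creates the entry with value 0 on first sight, then increments it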

Cache-friendliness std::list vs std::vector

With CPU caches becoming better and better, std::vector usually outperforms std::list, even in tests that play to the strengths of a std::list. For this reason, even in situations where I need to delete/insert in the middle of the container, I usually pick std::vector, but I realized I had never tested this to make sure my assumptions were correct. So I set up some test code:
#include <iostream>
#include <chrono>
#include <iterator>
#include <list>
#include <vector>
#include <random>

void TraversedDeletion()
{
    std::random_device dv;
    std::mt19937 mt{ dv() };
    std::uniform_int_distribution<> dis(0, 100000000);

    std::vector<int> vec;
    for (int i = 0; i < 100000; ++i)
    {
        vec.emplace_back(dis(mt));
    }
    std::list<int> lis;
    for (int i = 0; i < 100000; ++i)
    {
        lis.emplace_back(dis(mt));
    }

    {
        std::cout << "Traversed deletion...\n";
        std::cout << "Starting vector measurement...\n";
        auto now = std::chrono::system_clock::now();
        auto index = vec.size() / 2;
        auto itr = vec.begin() + index;
        for (int i = 0; i < 10000; ++i)
        {
            itr = vec.erase(itr);
        }
        std::cout << "Took " << std::chrono::duration_cast<std::chrono::microseconds>(std::chrono::system_clock::now() - now).count() << " μs\n";
    }

    {
        std::cout << "Starting list measurement...\n";
        auto now = std::chrono::system_clock::now();
        auto index = lis.size() / 2;
        auto itr = lis.begin();
        std::advance(itr, index);
        for (int i = 0; i < 10000; ++i)
        {
            auto it = itr;
            std::advance(itr, 1);
            lis.erase(it);
        }
        std::cout << "Took " << std::chrono::duration_cast<std::chrono::microseconds>(std::chrono::system_clock::now() - now).count() << " μs\n";
    }
}

void RandomAccessDeletion()
{
    std::random_device dv;
    std::mt19937 mt{ dv() };
    std::uniform_int_distribution<> dis(0, 100000000);

    std::vector<int> vec;
    for (int i = 0; i < 100000; ++i)
    {
        vec.emplace_back(dis(mt));
    }
    std::list<int> lis;
    for (int i = 0; i < 100000; ++i)
    {
        lis.emplace_back(dis(mt));
    }

    std::cout << "Random access deletion...\n";
    std::cout << "Starting vector measurement...\n";
    std::uniform_int_distribution<> vect_dist(0, vec.size() - 10000);
    auto now = std::chrono::system_clock::now();
    for (int i = 0; i < 10000; ++i)
    {
        auto rand_index = vect_dist(mt);
        auto itr = vec.begin();
        std::advance(itr, rand_index);
        vec.erase(itr);
    }
    std::cout << "Took " << std::chrono::duration_cast<std::chrono::microseconds>(std::chrono::system_clock::now() - now).count() << " μs\n";

    std::cout << "Starting list measurement...\n";
    now = std::chrono::system_clock::now();
    for (int i = 0; i < 10000; ++i)
    {
        auto rand_index = vect_dist(mt);
        auto itr = lis.begin();
        std::advance(itr, rand_index);
        lis.erase(itr);
    }
    std::cout << "Took " << std::chrono::duration_cast<std::chrono::microseconds>(std::chrono::system_clock::now() - now).count() << " μs\n";
}

int main()
{
    RandomAccessDeletion();
    TraversedDeletion();
    std::cin.get();
}
All results are compiled with /O2 (Maximize speed).
The first test, RandomAccessDeletion(), generates a random index and erases the element at that index, 10,000 times. My assumptions were right and the vector is indeed a lot faster than the list:
Random access deletion...
Starting vector measurement...
Took 240299 μs
Starting list measurement...
Took 1368205 μs
The vector is about 5.6x faster than the list. We can most likely thank our cache overlords for this performance benefit: even though we need to shift the elements in the vector on every deletion, its impact is less than the lookup time of the list, as the benchmark shows.
So then I added another test, seen in TraversedDeletion(). It doesn't delete at randomized positions; instead it picks an index in the middle of the container, uses that as a base iterator, and then traverses the container, erasing 10,000 times.
My assumption was that the list would outperform the vector only slightly, or be about as fast.
The results for the same execution:
Traversed deletion...
Starting vector measurement...
Took 195477 μs
Starting list measurement...
Took 581 μs
Wow. The list is about 336x faster. This is really far off from my expectations. So having a few cache misses in the list doesn't seem to matter at all here, since cutting out the list's lookup time weighs in far more.
So does the list apparently still hold a really strong position when it comes to performance in corner/unusual cases, or are my test cases flawed in some way?
Does this mean that the list nowadays is only a reasonable option for lots of insertions/deletions in the middle of a container while traversing, or are there other cases?
Is there a way I could change the vector access & erasure in TraversedDeletion() to make it at least a bit more competitive with the list?
In response to #BoPersson's comment:
vec.erase(it, it+10000) would perform a lot better than doing 10000
separate deletes.
Changing:
for (int i = 0; i < 10000; ++i)
{
    itr = vec.erase(itr);
}
To:
vec.erase(itr, itr + 10000);
Gave me:
Starting vector measurement...
Took 19 μs
This is a major improvement already.
In TraversedDeletion you are essentially doing a pop_front, but instead of being at the front you are doing it in the middle. For a linked list this is not an issue: deleting the node is an O(1) operation. Unfortunately, when you do this in the vector it is an O(N) operation, where N is vec.end() - itr. This is because it has to shift every element after the deletion point one position toward the front. That is why it is so much more expensive in the vector case.
On the other hand, in RandomAccessDeletion you are constantly changing the deletion point. For the list this means an O(N) operation to traverse to the node to delete and an O(1) operation to delete it, versus an O(1) traversal to find the element and an O(N) operation to shift the following elements for the vector. The reason the two are not equal is that the cost of traversing from node to node has a higher constant factor than shifting elements in the vector.
The long duration for the list in RandomAccessDeletion is due to the time it takes to advance from the beginning of the list to the randomly selected element, an O(N) operation.
TraversedDeletion just increments an iterator, an O(1) operation.
The "fast" part about a vector is "reaching" (traversing to) the element that needs to be accessed. In the deletion test you don't actually traverse much of the vector; you only access the first element to be erased. (I would say the advance-by-one barely shows up in the measurement.)
The deletion itself then takes quite a lot of time (O(n) each, so deleting the elements one by one is O(n²)) because it has to shift elements around in memory. And because each deletion rewrites the memory after the deleted element, you also cannot benefit from prefetching, which is one of the things that normally makes the vector so fast.
I am not sure how much the deletions also invalidate the caches, since the memory beyond the iterator keeps changing, but that too can have a very big impact on performance.
In the first test, the list had to traverse to the point of deletion, then delete the entry. The time the list took was in traversing for each deletion.
In the second test, the list traversed once, then repeatedly deleted. The time taken was still in the traversal; the deletion was cheap. Except now we don't repeatedly traverse.
For the vector, traversal is free. Deletion takes time. Randomly deleting an element takes less time than it took for the list to traverse to that random element, so vector wins in the first case.
In the second case, the vector does its hard work many, many more times than the list does its hard work.
But, the problem is that isn't how you should traverse-and-delete from a vector. It is an acceptable way to do it for a list.
The way you'd write this for a vector is std::remove_if, followed by erase. Or just one erase:
auto index = vec.size() / 2;
auto itr = vec.begin() + index;
vec.erase(itr, itr+10000);
Or, to emulate a more complex decision making process involving erasing elements:
auto index = vec.size() / 2;
auto itr = vec.begin() + index;
int count = 10000;
auto last = std::remove_if(itr, vec.end(),
    [&count](auto&&) {
        if (count <= 0) return false;
        --count;
        return true;
    }
);
vec.erase(last, vec.end());
Almost the only case where list is way faster than vector is when you store an iterator into the list, and you periodically erase at or near that iterator while still traversing the list between such erase actions.
Almost every other use case has a vector use-pattern that matches or exceeds list performance in my experience.
The code cannot always be translated line-for-line, as you have demonstrated.
Every time you erase a single element from a vector, it moves the "tail" of the vector over by 1.
If you erase 10,000 elements in one call, it moves the "tail" of the vector over by 10,000 in one step.
If you remove_if, it moves the tail over efficiently, gives you the "wasted" remainder at the end, and you can then erase that waste from the vector.
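To make the "stored iterator" case mentioned a couple of paragraphs up concrete, here is a rough sketch (should_erase is just a stand-in for whatever per-element decision you are making); each erase is O(1) and leaves every other iterator valid:

#include <list>

bool should_erase(int v) { return v % 2 == 0; }   // hypothetical per-element decision

void prune(std::list<int>& lis)
{
    for (auto it = lis.begin(); it != lis.end(); )
    {
        if (should_erase(*it))
            it = lis.erase(it);   // O(1); nothing is shifted, other iterators stay valid
        else
            ++it;
    }
}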
I want to point out something not yet mentioned in this question:
In a std::vector, when you delete an element in the middle, the elements after it are moved rather than copied, thanks to move semantics. That is one of the reasons the first test is as fast as it is: you are not even copying the elements after the deleted iterator. You could reproduce the experiment with a vector and a list of a non-copyable type and see how the list's relative performance improves.
I would suggest running the same tests with a more complex data type in the std::vector: instead of an int, use a structure.
Even better, use a fixed-size C array as the vector element, and then take measurements with different array sizes.
So, you could swap this line of your code:
std::vector<int> vec;
with something like:
const size_t size = 256;
struct TestType { int a[size]; };
std::vector<TestType> vec;
and test with different values of size. The behavior may depend on this parameter.

Increasing the size of a vector in for loop

If I increase the size of a std::vector inside a for loop whose condition depends on that size, will it work? Will the for loop recalculate the size of the vector on each iteration?
Example:
for (int i = 0; i < myVector.size(); i++)
{
    myVector.push_back(new element);
}
Thanks
Yes, myVector.size() will be re-evaluated on each iteration, returning a larger value each time. Therefore, your loop would never end, because it would be like a dog chasing its own tail (assuming that the initial size is non-zero).
If you would like to double the number of elements in the vector, you need to store myVector.size() upfront, and use the stored value in your loop, like this:
size_t origSize = myVector.size();
for (int i = 0; i < origSize; i++)
{
    myVector.push_back(new element);
}
The condition of a for loop is re-evaluated on each iteration. So that example code will never terminate, as the size keeps growing and i never catches up.

How to avoid reallocation using the STL (C++)

This question is derived from the topic:
vector reserve c++
I am using a data structure of the type vector<vector<vector<double> > >. It is not possible to know the size of each of these vectors (except the outer one) before items (doubles) are added. I can get an approximate size (upper bound) on the number of items in each "dimension".
A solution with shared pointers might be the way to go, but I would like to try a solution where the vector<vector<vector<double> > > simply has .reserve()ed enough space (or in some other way has allocated enough memory).
Will A.reserve(500) (assuming 500 is the size or, alternatively, an upper bound on the size) be enough to hold "2D" vectors of large size, say [1000][10000]?
The reason for my question is mainly because I cannot see any way of reasonably estimating the size of the interior of A at the time of .reserve(500).
An example of my question:
vector<vector<vector<int> > > A;
A.reserve(500 + 1);
vector<vector<int> > temp2;
vector<int> temp1(666, 666);
for (int i = 0; i < 500; i++)
{
    A.push_back(temp2);
    for (int j = 0; j < 10000; j++)
    {
        A.back().push_back(temp1);
    }
}
Will this ensure that no reallocation is done for A?
If temp2.reserve(100000) and temp1.reserve(1000) were added at creation, will this ensure that no reallocation at all will occur?
In the above please disregard the fact that memory could be wasted due to conservative .reserve() calls.
Thank you all in advance!
Your example will cause a lot of copying and allocations.
vector<vector<vector<double>>> A;
A.reserve(500 + 1);
vector<vector<double>> temp2;
vector<double> temp1(666, 666);
for (int i = 0; i < 500; i++)
{
    A.push_back(temp2);
    for (int j = 0; j < 10000; j++)
    {
        A.back().push_back(temp1);
    }
}
Q: Will this ensure that no reallocation is done for A?
A: Yes.
Q: If temp2.reserve(100000) and temp1.reserve(1000) were added at creation, will this ensure that no reallocation at all will occur?
A: Here temp1 already knows its own length at creation time and will not be modified, so adding temp1.reserve(1000) would only force an unneeded reallocation.
I wasn't sure what the vector copy constructor carries over; using A.back().reserve(10000) should work for this example.
Update: Just tested with g++, the capacity of temp2 will not be copied. So temp2.reserve(10000) will not work.
And please use code formatting when you post code; it makes it more readable :-).
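A quick way to check that yourself (a small sketch of the test mentioned above, using one fewer level of nesting than in the question):

#include <iostream>
#include <vector>

int main()
{
    std::vector<std::vector<double> > A;
    std::vector<double> temp2;
    temp2.reserve(10000);
    A.push_back(temp2);                        // push_back stores a *copy* of temp2
    std::cout << temp2.capacity() << "\n";     // >= 10000
    std::cout << A.back().capacity() << "\n";  // typically 0: capacity is not part of the copy
}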
How can reserving 500 entries in A beforehand be enough for [1000][10000]?
You need to reserve > 1000 for A (which is your actual upper-bound value), and then whenever you add an entry to A, reserve in it another 1000 or so (again the upper bound, but for the second dimension),
i.e.
A.reserve(OUTER_UPPERBOUND);
for (int i = 0; i < OUTER_UPPERBOUND; ++i)
{
    A.emplace_back();                     // reserve alone does not create the inner vectors
    A.back().reserve(INNER_UPPERBOUND);
}
BTW, reserve reserves the number of elements, not the number of bytes.
The reserve function will work properly for your vector A, but will not work as you expect for temp1 and temp2.
The temp1 vector is initialized with a given size, so it will be set with the proper capacity, and you don't need to use reserve on it as long as you don't plan to increase its size.
Regarding temp2, the capacity attribute is not carried over in a copy. Considering that whenever you use the push_back function you are adding a copy to your vector, with code like this
vector<vector<double>> temp2;
temp2.reserve(1000);
A.push_back(temp2); //A.back().capacity() == 0
you are just increasing the allocated memory for a temporary that will soon be deallocated, not increasing the capacity of the vector's elements as you expect. If you really want to use a vector of vectors as your solution, you will have to do something like this:
vector<vector<double>> temp2;
A.push_back(temp2);
A.back().reserve(1000); //A.back().capacity() == 1000
I had the same issue one day. A clean way to do this (I think) is to write your own Allocator and use it for the inner vectors (the last template parameter of std::vector<>). The idea is to write an allocator that doesn't actually allocate memory but simply returns the right address inside the memory of your outer vector. You can easily know this address if you know the size of each of the previous vectors.
In order to avoid copies and reallocation for a data structure such as vector<vector<vector<double> > >, I would suggest the following:
vector<vector<vector<double> > > myVector(FIXED_SIZE);
In order to 'assign' values to it, don't define your inner vectors until you actually know their required dimensions, then use swap() instead of assignment:
vector<vector<double> > innerVector( KNOWN_DIMENSION );
myVector[i].swap( innerVector );
Note that push_back() will do a copy operation and might cause reallocation, while swap() won't (assuming same allocator types are used for both vectors).
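As a small side note (not part of the original answer): with C++11 and later, a move assignment achieves the same effect as the swap trick, stealing innerVector's buffer instead of copying elements:

myVector[i] = std::move(innerVector);   // requires <utility>; innerVector is left in a valid but unspecified (typically empty) state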
It seems to me that you need a real matrix class instead of nesting vectors. Have a look at boost, which has some strong sparse matrix classes.
Ok, now I have done some small-scale testing on my own. I used a "2DArray" obtained from http://www.tek-tips.com/faqs.cfm?fid=5575 to represent a structure with its memory allocated up front rather than grown dynamically. For the dynamic allocation I used vectors almost as indicated in my original post.
I tested the following code (hr_time is a timing routine found on the web which, due to anti-spam restrictions, I unfortunately cannot link, but credit to David Bolton for providing it):
#include <vector>
#include "hr_time.h"
#include "2dArray.h"
#include <iostream>

using namespace std;

int main()
{
    vector<int> temp;
    vector<vector<int> > temp2;
    CStopWatch mytimer;

    mytimer.startTimer();
    for (int i = 0; i < 1000; i++)
    {
        temp2.push_back(temp);
        for (int j = 0; j < 2000; j++)
        {
            temp2.back().push_back(j);
        }
    }
    mytimer.stopTimer();
    cout << "With vectors without reserved: " << mytimer.getElapsedTime() << endl;

    vector<int> temp3;
    vector<vector<int> > temp4;
    temp3.reserve(1001);

    mytimer.startTimer();
    for (int i = 0; i < 1000; i++)
    {
        temp4.push_back(temp3);
        for (int j = 0; j < 2000; j++)
        {
            temp4.back().push_back(j);
        }
    }
    mytimer.stopTimer();
    cout << "With vectors with reserved: " << mytimer.getElapsedTime() << endl;

    int** MyArray = Allocate2DArray<int>(1000, 2000);

    mytimer.startTimer();
    for (int i = 0; i < 1000; i++)
    {
        for (int j = 0; j < 2000; j++)
        {
            MyArray[i][j] = j;
        }
    }
    mytimer.stopTimer();
    cout << "With 2DArray: " << mytimer.getElapsedTime() << endl;

    //Test
    for (int i = 0; i < 1000; i++)
    {
        for (int j = 0; j < 200; j++)
        {
            //cout << "My Array stores :" << MyArray[i][j] << endl;
        }
    }
    return 0;
}
It turns out that there is approximately a factor-10 difference for these sizes. I should thus reconsider whether dynamic allocation is appropriate for my application, since speed is of utmost importance!
Why not subclass the inner containers and reserve() in constructors?
If the matrix does get really large and sparse, I'd try a sparse matrix lib too. Otherwise, before messing with allocators, I'd try replacing vector with deque. A deque won't reallocate its existing elements when growing and offers almost as fast random access as a vector.
This was more or less answered here. So your code would look something like this:
vector<vector<vector<double> > > foo(maxdim1,
    vector<vector<double> >(maxdim2,
        vector<double>(maxdim3)));