I iterated over a vector with a range-based for loop and modified the vector inside the loop body. However, after the program left the loop, the vector was still unchanged. What caused the problem? If I still want to use a range-based for loop, how can I fix it?
Here is the code (my solution for leetcode 78):
class Solution {
public:
    void print(vector<int> t) {
        for (int a : t) {
            cout << a << " ";
        }
        cout << endl;
    }

    vector<vector<int>> subsets(vector<int>& nums) {
        vector<vector<int>> res;
        res.push_back(vector<int>{});
        int m = nums.size();
        for (int a : nums) {
            cout << "processing " << a << endl;
            for (auto t : res) {
                vector<int> temp{a};
                temp.insert(temp.end(), t.begin(), t.end());
                res.push_back(temp);
                cout << "temp is ";
                print(temp);
            }
            // int s = res.size();
            // for (int i = 0; i < s; i++) {
            //     vector<int> temp{a};
            //     temp.insert(temp.end(), res[i].begin(), res[i].end());
            //     res.push_back(temp);
            // }
        }
        return res;
    }
};
If I use the part I commented out in place of the range-based for loop, it gives the correct solution.
The shown code exhibits undefined behavior.
Inside the for-loop:
res.push_back(temp);
Adding new elements to a std::vector invalidates all existing iterators into the vector whenever it reallocates (there are a few edge cases on this topic, but they are not relevant here). However, this push_back sits inside the for-loop itself:
for(auto t:res){
The for-loop iterates over the vector. Range iteration, internally, uses iterators to iterate over the container. As soon as the first push_back here adds a value to the vector, the next iteration of this for-loop is undefined behavior. Game over.
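For intuition, here is roughly what the compiler generates for a range-for over a vector - a sketch with a hypothetical helper name. begin() and end() are captured once up front, which is exactly why a mid-loop push_back is fatal:

```cpp
#include <vector>

// Roughly how `for (auto t : res) { ... }` is desugared: the iterators
// are captured once, so any push_back that reallocates leaves both `it`
// and `last` dangling. This version only reads, to illustrate the shape.
std::vector<int> range_for_desugared(const std::vector<int>& res) {
    std::vector<int> out;
    for (auto it = res.begin(), last = res.end(); it != last; ++it) {
        auto t = *it; // the loop variable
        out.push_back(t); // appending to `out` is fine; appending to `res` would not be
    }
    return out;
}
```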
The problem here is that you are looping over the subsets created so far while appending more subsets in the same loop. There are two problems with this.
First (as pointed out by Sam), vector::push_back() may invalidate the iterators which control the loop, thus breaking the code.
Second, even when using a container (such as deque or list) where push_back() would not invalidate any iterators, your loop would run indefinitely, as you keep adding new elements.
The correct way is to loop only over the subsets created before the loop starts, while adding new subsets, i.e. doubling the number of subsets. The easiest way to achieve this is to use good old index-based loops and allocate/reserve enough subsets (2^n) at the outset.
vector<vector<int>> subsets(vector<int> const& nums)
{
    const auto n = nums.size();
    vector<vector<int>> subset(1 << n);          // there are 2^n distinct subsets
    for (size_t i = 0, j = 1; i != n; ++i)       // loop over distinct nums
        for (size_t k = j, s = 0; s != k; ++s) { // loop over all subsets so far, s = 0..j-1
            auto sset = subset[s];               // extract copy of subset
            sset.push_back(nums[i]);             // extend it by nums[i]
            subset.at(j++) = std::move(sset);    // store it at j
        }
    return subset;
}
The behavior of unordered_map elements disappearing unexpectedly in the following C++ code confused me a whole lot. In the first for loop I store the remainder modulo 60 of each element in time, together with its count, in unordered_map<int, int> m; in the second for loop I print the contents of m, and so far everything seems to work.
cout as following
0:1
39:1
23:1
18:1
44:1
59:1
12:1
38:1
56:2
17:1
37:1
24:1
58:1
However, the third for loop printed only part of the elements in m:
0:1
39:1
58:1
0:1
It seems many elements in m were erased by the n += m[remainder]*m[60-remainder]; operation. I was very confused by this behavior; could you please explain what is going on here?
#include <iostream>
#include <unordered_map>
#include <vector>
using namespace std;

int main() {
    vector<int> time({418,204,77,278,239,457,284,263,372,279,476,416,360,18});
    int n = 0;
    unordered_map<int,int> m; // <remainder, cnt>
    for (auto t : time)
        m[t%60]++;
    for (auto [remainder,cnt] : m)
        cout << remainder << ":" << cnt << endl;
    cout << endl;
    for (auto [remainder,cnt] : m) {
        cout << remainder << ":" << cnt << endl;
        if (remainder==0 || remainder==30)
            n += cnt*(cnt-1)/2;
        else
            n += m[remainder]*m[60-remainder];
    }
}
The third loop uses the [] operator inside the loop.
for (auto [remainder,cnt]:m){
// ...
n += m[remainder]*m[60-remainder];
unordered_map's [] invalidates all existing iterators if it results in a rehash. This includes the implicit iterators employed during range iteration.
As shown, m[remainder] cannot cause a rehash, because it only ever accesses an existing value in the unordered map; but m[60-remainder] may insert a missing key and therefore trigger a rehash, resulting in undefined behavior.
You just need to replace this use of the [] operator with the equivalent find() (and, of course, correctly handle the end() value if it is returned).
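A minimal sketch of that fix (the original counting logic is kept unchanged, only operator[] is swapped for find(); the function name is made up for illustration):

```cpp
#include <unordered_map>
#include <vector>

// Same computation as the question's third loop, but the map is never
// modified while the range-for is iterating over it: find() does not
// insert, so it cannot trigger a rehash.
int count_pairs(const std::vector<int>& time) {
    std::unordered_map<int,int> m; // <remainder, cnt>
    for (int t : time)
        m[t % 60]++;
    int n = 0;
    for (auto [remainder, cnt] : m) {
        if (remainder == 0 || remainder == 30) {
            n += cnt * (cnt - 1) / 2;
        } else {
            auto it = m.find(60 - remainder); // no insertion, no rehash
            if (it != m.end())
                n += cnt * it->second;
        }
    }
    return n;
}
```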
I am trying to implement a recursion wherein I pass an unordered_set<int> by reference, to reduce space and time complexity. In order to do this, I must iteratively do the following: remove an element, call recursively, and then insert the element back into the unordered_set.
This is a sample code for the recursive function for printing all permutations of a vector<int> A is as follows,
void recur(int s, vector<int> &A, vector<int> &curr, vector<vector<int>> &ans, unordered_set<int> &poses){
    if (s == A.size()) {
        ans.push_back(curr);
        return;
    }
    for (unordered_set<int>::iterator it = poses.begin(); it != poses.end(); it++) {
        int temp = *it;
        curr[temp] = A[s];
        poses.erase(it);
        recur(s + 1, A, curr, ans, poses);
        poses.insert(temp);
        curr[temp] = -1;
    }
}
You can assume that I'm passing curr with all -1s initially.
When iterating through an unordered_set it is guaranteed to visit all unique elements. I was wondering whether the same holds if I remove and add elements back while iterating. Will the position of the element in the hash set change, or is that implementation-dependent?
If this is incorrect, could someone suggest another way to go about this? I do not want to pass by value, since that would copy the entire set for every recursive call on the stack. Any help would be appreciated.
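For what it's worth, one iteration-safe variant (a sketch, not necessarily the only or best option) is to iterate over a snapshot of the set, so that erase/insert on poses never touches a live iterator:

```cpp
#include <unordered_set>
#include <vector>
using std::vector; using std::unordered_set;

// Sketch: copy the candidate positions into a vector first, then
// erase/insert on `poses` by key. Since the loop walks the snapshot,
// mutating `poses` can no longer invalidate the loop's iterator.
void recur(int s, vector<int>& A, vector<int>& curr,
           vector<vector<int>>& ans, unordered_set<int>& poses) {
    if (s == (int)A.size()) {
        ans.push_back(curr);
        return;
    }
    vector<int> snapshot(poses.begin(), poses.end());
    for (int temp : snapshot) {
        curr[temp] = A[s];
        poses.erase(temp);            // erase by key, not by iterator
        recur(s + 1, A, curr, ans, poses);
        poses.insert(temp);           // restore for the next candidate
        curr[temp] = -1;
    }
}
```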
I have a vector<vector<int>> A of size 44,000. I need to intersect A with another vector, vector<int> B, of size 400,000. The inner vectors of A have variable size, at most 9,000 elements. To do this I am using the following code:
for (int i = 0; i < 44000; i++) {
    vector<int> intersect;
    set_intersection(A[i].begin(), A[i].end(), B.begin(), B.end(),
                     std::back_inserter(intersect));
}
Is there some way to make the code more efficient? All the elements in vector A are sorted, e.g. they are of the form ((0,1,4,5),(7,94,830,1000)), etc. That is, all elements of A[i]'s vector < all elements of A[j]'s vector if i < j.
EDIT: One of the solutions I thought about is to merge all the A[i] together into another vector mergedB:
vector<int> mergedB;
for (int i = 0; i < 44000; i++)
    mergedB.insert(mergedB.end(), A[i].begin(), A[i].end());
vector<int> intersect;
set_intersection(mergedB.begin(), mergedB.end(), B.begin(), B.end(),
                 std::back_inserter(intersect));
However, I do not understand why I get almost the same performance with both versions. Can someone please help me understand this?
As it happens, set_intersection is easy to write.
A fancy way would be to create a concatenating iterator and go over each element of the lhs vector. But it is easier to write the set intersection manually.
template<class MetaIt, class FilterIt, class Sink>
void meta_intersect(MetaIt mb, MetaIt me, FilterIt b, FilterIt e, Sink sink) {
    using std::begin; using std::end;
    if (b == e) return;
    while (mb != me) {
        auto b2 = begin(*mb);
        auto e2 = end(*mb);
        if (b2 == e2) {
            ++mb;
            continue;
        }
        do {
            if (*b2 < *b) {
                ++b2;
                continue;
            }
            if (*b < *b2) {
                ++b;
                if (b == e) return;
                continue;
            }
            *sink = *b2;
            ++sink; ++b; ++b2;
            if (b == e) return;
        } while (b2 != e2);
        ++mb;
    }
}
This does not copy elements, other than into the output vector. It assumes MetaIt is an iterator to containers, FilterIt is an iterator to a compatible container, and Sink is an output iterator.
I attempted to remove all redundant comparisons while keeping the code somewhat readable. There is one redundant check -- we check b != e and then b == e in the single case where we run out of rhs contents. As this should only happen once, the cost to clarity isn't worth eliminating it.
You could possibly make the above more efficient with vectorization on modern hardware. I'm not an expert at that. Mixing vectorization with the meta-iteration is tricky.
Since your vectors are sorted, the simplest and fastest algorithm is:
1. Set the current element of both vectors to the first value.
2. Compare the two current elements. If they are equal, you have an intersection, so record it and increment both vectors' current elements.
3. If not equal, increment the vector with the smaller current element.
4. Goto 2.
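The steps above can be sketched as a classic two-pointer loop (a minimal illustration; intersect_sorted is a made-up name, and both inputs are assumed sorted ascending as in the question):

```cpp
#include <vector>

// Two-pointer intersection of two sorted ranges, following the numbered
// steps: advance the smaller side, record on equality.
std::vector<int> intersect_sorted(const std::vector<int>& a,
                                  const std::vector<int>& b) {
    std::vector<int> out;
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        if (a[i] == b[j]) {        // step 2: equal -> intersection
            out.push_back(a[i]);
            ++i; ++j;
        } else if (a[i] < b[j]) {  // step 3: advance the smaller side
            ++i;
        } else {
            ++j;
        }
    }
    return out;
}
```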
I have two code samples which do exactly the same thing. One is in C++03 and one is in C++11.
C++ 11
int main()
{
    vector<int> v = {1,2,3};
    int count = 0;
    for each (auto it in v)
    {
        cout << it << endl;
        if (count == 0)
        {
            count++;
            v.push_back(4); // adding value to vector
        }
    }
    return 0;
}
C++ 03
int main()
{
    vector<int> v = {1,2,3};
    int count = 0;
    for (vector<int>::iterator it = v.begin(); it != v.end(); it++)
    {
        cout << *it << endl;
        if (count == 0)
        {
            count++;
            v.push_back(4); // adding value to vector
        }
    }
    return 0;
}
Both code samples give the following exception.
Now, when I look at the vector::end() implementation:
iterator end() _NOEXCEPT
{   // return iterator for end of mutable sequence
    return (iterator(this->_Mylast, this));
}
Here, the inline function clearly uses _Mylast to compute end(). So when I add an element, the pointer should just be incremented to the next location, like _Mylast++. Why am I getting this exception?
Thanks.
A vector stores its elements in contiguous memory. If that memory block needs to be reallocated, iterators become invalid.
If you need to modify the vector's size while iterating, iterate by index instead of iterator.
Another option is to use a different container with a different iterator behavior, for example a list will allow you to continue iterating as you insert items.
And finally (dare I suggest this?), if you know the maximum size your vector will grow to, .reserve() it before iterating over it. This will ensure it doesn't get reallocated during your loop. I am not sure whether the standard guarantees that this makes the loop well-defined, though (maybe someone can chime in); I would definitely not do it, considering that iterating by index is perfectly safe.
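A quick sketch of the index-based workaround (the function name is invented for illustration). Indexing re-reads the vector's internal pointer on every access, so a reallocation inside the loop is harmless:

```cpp
#include <vector>

// Visit each element by index while appending once, mirroring the
// question's loop. No iterator is held across the push_back, so the
// reallocation it may cause cannot invalidate anything we use.
std::vector<int> visit_while_appending(std::vector<int> v) {
    std::vector<int> visited;
    for (std::size_t i = 0; i < v.size(); ++i) { // v.size() is re-evaluated each pass
        if (i == 0)
            v.push_back(4); // would invalidate iterators, but we hold none
        visited.push_back(v[i]);
    }
    return visited;
}
```
Note that with `i < v.size()` the loop also visits elements appended during iteration; snapshot the original size first if you only want the initial elements.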
Your push_back is invalidating the iterator you're using in the for loop, because the vector is reallocating its memory, which invalidates all iterators to elements of the vector.
The idiomatic solution for this is to use an insert iterator, like the one you get from calling std::back_inserter on the vector. Then you can do:
#include <iostream>
#include <iterator>
#include <vector>
int main()
{
    std::vector<int> v;
    auto inserter = std::back_inserter(v);
    for (int i = 0; i < 100; ++i)
        inserter = i;
    for (const auto item : v)
        std::cout << item << '\n';
}
And it will ensure its own validity even through reallocation calls of the underlying container.
I want to know what the difference(s) between vector's push_back and insert functions are.
Is there a structural difference?
Is there a really big performance difference?
The biggest difference is their functionality. push_back always puts a new element at the end of the vector, while insert lets you select the new element's position. This impacts performance: with push_back, vector elements are moved in memory only when the vector has to grow because too little memory was allocated for it. insert, on the other hand, forces all elements after the selected position to move in order to make a place for the new element. This is why insert is often less efficient than push_back.
The functions have different purposes. vector::insert allows you to insert an object at a specified position in the vector, whereas vector::push_back will just stick the object on the end. See the following example:
using namespace std;
vector<int> v = {1, 3, 4};
v.insert(next(begin(v)), 2);
v.push_back(5);
// v now contains {1, 2, 3, 4, 5}
You can use insert to perform the same job as push_back with v.insert(v.end(), value).
Besides the fact that push_back(x) does the same as insert(end(), x) (perhaps with slightly better performance), there are several important things to know about these functions:
push_back exists only on BackInsertionSequence containers - so, for example, it doesn't exist on set. It couldn't, because push_back() guarantees that it will always add at the end.
Some containers can also satisfy FrontInsertionSequence and have push_front. This is satisfied by deque, but not by vector.
insert(ITERATOR, x) is from InsertionSequence, which is common to set and vector. This way you can use either set or vector as a target for multiple insertions. set additionally has insert(x), which does practically the same thing (the iterator in set's two-argument insert only serves as a hint to speed up the search for the appropriate place - a feature not used in this case).
Note about the last case: if you are going to add elements in a loop, then container.push_back(x) and container.insert(container.end(), x) will do effectively the same thing. However, this won't be true if you obtain container.end() first and then use it throughout the whole loop.
For example, you could risk the following code:
auto pe = v.end();
for (auto& s : a)
    v.insert(pe, s);
This would effectively copy the whole of a into the v vector in reverse order, and only if you are lucky enough that the vector does not get reallocated for the extension (you can try to prevent this by calling reserve() first); if you are not so lucky, you get so-called UndefinedBehavior(tm). Formally this isn't allowed at all, because vector's iterators are considered invalidated every time a new element is added.
If you do it this way:
copy(a.begin(), a.end(), back_inserter(v));
it will copy a at the end of v in the original order, and this doesn't carry a risk of iterator invalidation.
[EDIT] I previously presented the code above in the following form, and that was a mistake, because inserter actually maintains the validity and advancement of the iterator:
copy(a.begin(), a.end(), inserter(v, v.end()));
So this code will also add all elements in the original order without any risk.
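To illustrate the claim, here is a small sketch (the helper names are invented) showing that both back_inserter and inserter(v, v.end()) append in the original order:

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Appends `a` to `v` through a back_insert_iterator: every assignment
// becomes v.push_back(x), so no stale iterator is ever used.
std::vector<int> append_via_back_inserter(std::vector<int> v,
                                          const std::vector<int>& a) {
    std::copy(a.begin(), a.end(), std::back_inserter(v));
    return v;
}

// Appends `a` to `v` through an insert_iterator anchored at v.end():
// the iterator re-anchors itself after every insert, so it stays
// logically "the end" even across reallocations.
std::vector<int> append_via_inserter(std::vector<int> v,
                                     const std::vector<int>& a) {
    std::copy(a.begin(), a.end(), std::inserter(v, v.end()));
    return v;
}
```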
I didn't see it in any of the answers above, but it is important to know:
If we wish to add a new element to a given vector and the new size of the vector (including the new element) surpasses the current capacity, it will cause an automatic reallocation of the allocated storage space.
And because memory allocation is an action we wish to minimize, the capacity is increased in the same way for both push_back and insert (the growth factor is implementation-defined; on my implementation, a vector with n elements grows by about n/2).
So in terms of memory efficiency, it is safe to say: use whichever you like best.
for example:
std::vector<int> test_Insert = { 1,2,3,4,5,6,7 };
std::vector<int> test_Push_Back = { 1,2,3,4,5,6,7 };

std::cout << test_Insert.capacity() << std::endl;
std::cout << test_Push_Back.capacity() << std::endl;

test_Insert.insert(test_Insert.end(), 8);
test_Push_Back.push_back(8);

std::cout << test_Insert.capacity() << std::endl;
std::cout << test_Push_Back.capacity() << std::endl;
On my implementation, this code prints:
7
7
10
10
Since there's no actual performance data, I reluctantly wrote some code to produce it. Keep in mind that I wrote this code because I wondered "Should I push_back multiple single elements, or use insert?".
#include <iostream>
#include <vector>
#include <cassert>
#include <chrono>
using namespace std;
vector<float> pushBackTest()
{
    vector<float> v;
    for (int i = 0; i < 10000000; i++)
    {
        // Using a for-loop took 200ms more (in the output)
        v.push_back(0);
        v.push_back(1);
        v.push_back(2);
        v.push_back(3);
        v.push_back(4);
        v.push_back(5);
        v.push_back(6);
        v.push_back(7);
        v.push_back(8);
        v.push_back(9);
    }
    return v;
}

vector<float> insertTest()
{
    vector<float> v;
    for (int i = 0; i < 10000000; i++)
    {
        v.insert(v.end(), {0,1,2,3,4,5,6,7,8,9});
    }
    return v;
}

int main()
{
    std::chrono::steady_clock::time_point start = chrono::steady_clock::now();
    vector<float> a = pushBackTest();
    cout << "pushBackTest: " << chrono::duration_cast<chrono::milliseconds>(chrono::steady_clock::now() - start).count() << "ms" << endl;

    start = std::chrono::steady_clock::now();
    vector<float> b = insertTest();
    cout << "insertTest: " << chrono::duration_cast<chrono::milliseconds>(chrono::steady_clock::now() - start).count() << "ms" << endl;

    assert(a == b);
    return 0;
}
Output:
pushBackTest: 5544ms
insertTest: 3402ms
Since curiosity killed my time, I ran a similar test, but adding a single number instead of multiple ones.
So, the two new functions are:
vector<float> pushBackTest()
{
    vector<float> v;
    for (int i = 0; i < 10000000; i++)
    {
        v.push_back(1);
    }
    return v;
}

vector<float> insertTest()
{
    vector<float> v;
    for (int i = 0; i < 10000000; i++)
    {
        v.insert(v.end(), 1);
    }
    return v;
}
Output:
pushBackTest: 452ms
insertTest: 615ms
So, if you want to add a batch of elements, insert is faster; otherwise push_back wins. Also, keep in mind that push_back can only push... back, yeah.