Unique elements in unordered_set, erasing and adding while iterating?

I am trying to implement a recursion in which I pass an unordered_set<int> by reference, to reduce space and time complexity. To do this, I must repeatedly do the following: remove an element, call the function recursively, and then re-insert the element into the unordered_set.
Here is sample code for the recursive function, which generates all permutations of a vector<int> A:
void recur(int s, vector<int> &A, vector<int> &curr, vector<vector<int>> &ans, unordered_set<int> &poses){
if(s == A.size()){
ans.push_back(curr);
return;
}
for(unordered_set<int>::iterator it = poses.begin() ; it != poses.end() ; it++){
int temp = *it;
curr[temp] = A[s];
poses.erase(it);
recur(s + 1, A, curr, ans, poses);
poses.insert(temp);
curr[temp] = -1;
}
}
You can assume that I'm passing curr with all -1s initially.
When iterating through an unordered_set, it is guaranteed that every unique element is visited. I was wondering whether the same holds if I remove elements and add them back while iterating. Will the position of an element in the hash set change, or does that depend on the implementation?
If this is incorrect, could someone suggest another way to go about this? I do not want to pass by value, since that would copy the entire set for every recursive call on the stack. Any help would be appreciated.
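One point worth noting about the code above: erasing from an unordered_set invalidates the iterator to the erased element, so the it++ that follows poses.erase(it) is undefined behavior, and an insert may trigger a rehash that invalidates every iterator. A minimal sketch of one alternative, assuming the positions are simply the indices 0..n-1 of curr, is to track taken positions in a vector<bool> instead of mutating a set during iteration. The names used and permutations are hypothetical and not part of the original code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical rewrite of recur(): used[p] marks positions of curr that are
// already filled, replacing the unordered_set of free positions.
void recur(std::size_t s, const std::vector<int>& A, std::vector<int>& curr,
           std::vector<std::vector<int>>& ans, std::vector<bool>& used) {
    if (s == A.size()) {
        ans.push_back(curr);
        return;
    }
    for (std::size_t p = 0; p < curr.size(); ++p) {
        if (used[p]) continue;  // position already taken higher up the recursion
        used[p] = true;         // plays the role of poses.erase(...)
        curr[p] = A[s];
        recur(s + 1, A, curr, ans, used);
        curr[p] = -1;           // restore the placeholder
        used[p] = false;        // plays the role of poses.insert(...)
    }
}

// Hypothetical convenience wrapper that sets up the initial state.
std::vector<std::vector<int>> permutations(const std::vector<int>& A) {
    std::vector<std::vector<int>> ans;
    std::vector<int> curr(A.size(), -1);
    std::vector<bool> used(A.size(), false);
    recur(0, A, curr, ans, used);
    return ans;
}
```

Everything is still passed by reference, so nothing is copied per call except the finished permutations pushed into ans.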

Related

Mapping a function onto a vector recursively

I have a class called mapTriple with a method that takes an integer vector and applies a private function of the mapTriple class to every value in the vector (the function takes an int and returns that int * 3).
I have already set up the classes and the function that triples the integer. I am stuck on the mapTriple method. The method cannot be iterative; it must be recursive.
vector<int> MapTriple::map(vector<int> myVector)
{
if(myVector.size() == 1)
{
myVector[0] = f(myVector[0]);
return myVector;
}
else
{
map(myVector.erase(myVector.begin()+myVector.size()-1));
myVector[myVector.size()-1] = f(myVector[myVector.size()-1]);
return myVector;
}
}
int f (int a)
{
return (a*3);
}
It currently isn't compiling; the compiler says there is no matching call to map. I have all the .h files, the main file, etc.

erase does not return the modified vector. It returns an iterator after the removed element (which will be end in your case, so you don't need that). Just pass the modified vector itself.
You currently don't re-add the erased element, so even if your code compiled, you would always be returning a vector of length 1 (and the remaining element would be tripled n times if the vector was originally size n).
The correct else branch should be:
else
{
    // Store and remove the last element.
    int currentElement = myVector.back();
    myVector.pop_back();
    // Recursively process the remaining elements. map takes its argument
    // by value, so the result has to be assigned back.
    myVector = map(myVector);
    // Process and re-add the above element.
    myVector.push_back(f(currentElement));
    return myVector;
}
However, instead of erasing elements and re-adding them, you could work with the iterators.
using Iterator = std::vector<int>::iterator;
void MapTriple::map(Iterator start, Iterator end)
{
// No elements remaining?
if (start == end)
return;
// Process first element.
*start = f(*start);
// Process remaining elements recursively.
map(start+1, end);
}
While this is pretty elegant, it would of course be even simpler to do this with a simple for loop:
for (auto& e : myVector) e = f(e);
or std::transform:
std::transform(myVector.begin(), myVector.end(), myVector.begin(),
               [this](int e) { return f(e); });
It should also be noted that map is probably a mediocre name for this method, if you did using namespace std; as seems to be the case (see also Why is "using namespace std" considered bad practice?).

for each in C++ fails to update the vector

I iterate over a vector with a for-each loop and modify the vector inside the loop. However, when I run the program, the vector is still unchanged after the loop ends. What causes the problem? If I still want to use a for-each loop, how can I fix it?
Here is the code (my solution for leetcode 78):
class Solution {
public:
void print(vector<int> t){
for(int a:t){
cout<<a<<" ";
}
cout<<endl;
}
vector<vector<int>> subsets(vector<int>& nums) {
vector<vector<int>> res;
res.push_back(vector<int>{});
int m=nums.size();
for(int a:nums){
cout<<"processing "<<a<<endl;
for(auto t:res){
vector<int> temp{a};
temp.insert(temp.end(),t.begin(), t.end());
res.push_back(temp);
cout<<"temp is ";
print(temp);
res.reserve();
}
// int s=res.size();
// for(int i=0;i<s;i++){
// vector<int> temp{a};
// temp.insert(temp.end(), res[i].begin(), res[i].end());
// res.push_back(temp);
// }
}
return res;
}
};
If I replace the for-each loop with the part I commented out, it gives the correct solution.

The shown code exhibits undefined behavior.
Inside the for-loop:
res.push_back(temp);
Adding new elements to a std::vector invalidates all existing iterators to the vector (there are several edge cases, on this topic, but they are not relevant here). However, this is inside the for-loop itself:
for(auto t:res){
The for-loop iterates over the vector. Range iteration, internally, uses iterators to iterate over the container. As soon as the first push_back here adds a value to the vector, the next iteration of this for-loop is undefined behavior. Game over.
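A minimal sketch of one way around this, assuming the goal is the same subset construction as in the question: take a snapshot of the size before the inner loop and index into the vector, so that no iterator is held across a push_back. The element is appended rather than prepended here, which is my simplification.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Sketch: build all subsets without iterating over res while it grows.
std::vector<std::vector<int>> subsets(const std::vector<int>& nums) {
    std::vector<std::vector<int>> res(1);  // start with the empty subset
    for (int a : nums) {
        const std::size_t count = res.size();  // snapshot taken before appending
        for (std::size_t i = 0; i < count; ++i) {
            // Copy first: a reference into res could dangle after push_back.
            std::vector<int> temp = res[i];
            temp.push_back(a);
            res.push_back(std::move(temp));
        }
    }
    return res;
}
```

Indices stay valid across reallocation, which iterators and references do not.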

The problem here is that you are looping over the subsets created so far and then add more subsets in the same loop, appending them. There are two problems with this.
First (as pointed out by Sam), vector::push_back() may invalidate iterators which control the loop, thus breaking the code.
Second, even when using a container such as list, where push_back() does not invalidate any iterators, your loop would run indefinitely, as you keep adding new elements. (Note that deque's push_back() does invalidate iterators, although it leaves references and pointers to elements valid.)
The correct way is to loop only over the subsets that existed before the loop started, and to add one new subset for each of them, i.e. doubling the number of subsets. The easiest way to achieve this is to use good old index-based loops and to allocate enough subsets (2^n) at the outset.
vector<vector<int>> subsets(vector<int> const& nums)
{
const auto n=nums.size();
vector<vector<int>> subset(1<<n); // there are 2^n distinct subsets
for(size_t i=0,j=1; i!=n; ++i) // loop distinct nums
for(size_t k=j,s=0; s!=k; ++s) { // loop all subsets so far s=0..j-1
auto sset = subset[s]; // extract copy of subset
sset.push_back(nums[i]); // extend it by nums[i]
subset.at(j++) = move(sset); // store it at j
}
return subset;
}

intersecting vectors in c++

I have a vector<vector<int> > A; of size 44,000. Now I need to intersect 'A' with another vector, vector<int> B, of size 400,000. The inner vectors of A vary in size, with a maximum of 9,000 elements. To do this I am using the following code:
for(int i=0;i<44000;i++) {
    vector<int> intersect;
    set_intersection(A[i].begin(),A[i].end(),B.begin(),B.end(),
                     std::back_inserter(intersect));
}
Is there some way to make this code more efficient? All the elements in vector A are sorted, i.e. they are of the form ((0,1,4,5),(7,94,830,1000)), etc. That is, all elements of A[i] are smaller than all elements of A[j] if i<j.
EDIT: One of the solutions I thought about is to merge all the A[i]'s together into another vector, mergedB, using:
vector<int> mergedB;
for(int i=0;i<44000;i++)
    mergedB.insert(mergedB.end(),A[i].begin(),A[i].end());
vector<int> intersect;
set_intersection(mergedB.begin(),mergedB.end(),B.begin(),B.end(),
                 std::back_inserter(intersect));
However, I do not understand why I am getting almost the same performance with both versions. Can someone please help me understand this?

As it happens, set_intersection is easy to write.
A fancy way would be to create a concatenating iterator and go over each element of the lhs vector. But it is easier to write set_intersection manually.
template<class MetaIt, class FilterIt, class Sink>
void meta_intersect(MetaIt mb, MetaIt me, FilterIt b, FilterIt e, Sink sink) {
using std::begin; using std::end;
if (b==e) return;
while (mb != me) {
auto b2 = begin(*mb);
auto e2 = end(*mb);
if (b2==e2) {
++mb;
continue;
}
do {
if (*b2 < *b) {
++b2;
continue;
}
if (*b < *b2) {
++b;
if (b==e) return;
continue;
}
*sink = *b2;
++sink; ++b; ++b2;
if (b==e) return;
} while (b2 != e2);
++mb;
}
}
This does not copy elements, other than into the output vector. It assumes MetaIt is an iterator over containers, FilterIt is an iterator into a compatible container, and Sink is an output iterator.
I attempted to remove all redundant comparisons while keeping the code somewhat readable. There is one redundant check -- we check b!=e and then b==e in the single case where we run out of rhs contents. As this should only happen once, the cost to clarity isn't worth it.
You could possibly make the above more efficient with vectorization on modern hardware. I'm not an expert at that. Mixing vectorization with the meta-iteration is tricky.

Since your vectors are sorted, the simplest and fastest algorithm is:
1. Set the current element of each vector to its first value.
2. Compare the two current elements. If they are equal, you have an intersection, so advance the current element of both vectors.
3. If they are not equal, advance the current element of the vector whose current element is smaller.
4. Go to 2, until either vector is exhausted.
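The steps above can be sketched as follows. This is only an illustration for two sorted vector<int> inputs; the function name intersect_sorted is made up, and std::set_intersection already implements the same idea.

```cpp
#include <cstddef>
#include <vector>

// Two-index walk over two sorted vectors, collecting the common values.
std::vector<int> intersect_sorted(const std::vector<int>& a,
                                  const std::vector<int>& b) {
    std::vector<int> out;
    std::size_t i = 0, j = 0;              // the two "current elements"
    while (i < a.size() && j < b.size()) {
        if (a[i] == b[j]) {                // equal: record it, advance both
            out.push_back(a[i]);
            ++i;
            ++j;
        } else if (a[i] < b[j]) {          // advance whichever side is smaller
            ++i;
        } else {
            ++j;
        }
    }
    return out;
}
```

Each element of either input is visited at most once, so the work is linear in the combined length.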

Insert multiple values into vector

I have a std::vector<T> variable. I also have two variables of type T, the first of which represents the value in the vector after which I am to insert, while the second represents the value to insert.
So lets say I have this container: 1,2,1,1,2,2
And the two values are 2 and 3 with respect to their definitions above. Then I wish to write a function which will update the container to instead contain:
1,2,3,1,1,2,3,2,3
I am using c++98 and boost. What std or boost functions might I use to implement this function?
Iterating over the vector and using vector::insert is one way, but it gets messy once you realize that you need to remember to hop over the value you just inserted.

This is what I would probably do:
vector<T> copy;
for (vector<T>::iterator i=original.begin(); i!=original.end(); ++i)
{
copy.push_back(*i);
if (*i == first)
copy.push_back(second);
}
original.swap(copy);
Put a call to reserve in there if you want. You know you need room for at least original.size() elements. You could also do an initial iteration over the vector (or use std::count) to determine the exact number of elements to reserve, but without testing, I don't know whether that would improve performance.
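For illustration, here is the same copy-and-swap idea wrapped in a function template, with the exact reserve computed via std::count. The name insert_after_each is an assumption of mine, and the function body sticks to C++98 since the question asks for it.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Builds a new vector with `second` placed after every `first`, then swaps
// it into `original`.
template <typename T>
void insert_after_each(std::vector<T>& original, const T& first, const T& second) {
    const std::size_t extra = static_cast<std::size_t>(
        std::count(original.begin(), original.end(), first));
    std::vector<T> copy;
    copy.reserve(original.size() + extra);  // exact final size, one allocation
    for (typename std::vector<T>::const_iterator i = original.begin();
         i != original.end(); ++i) {
        copy.push_back(*i);
        if (*i == first)
            copy.push_back(second);
    }
    original.swap(copy);
}
```

Applied to the container 1,2,1,1,2,2 with the values 2 and 3, this produces 1,2,3,1,1,2,3,2,3.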

I propose a solution that works in place, using O(n) memory and O(2n) time, instead of the O(n^2) time of the solution proposed by Laethnes and the O(2n) memory of the solution proposed by Benjamin.
// First pass, count elements equal to first.
std::size_t elems = std::count(data.begin(), data.end(), first);
// Resize so we'll add without reallocating the elements.
data.resize(data.size() + elems);
vector<T>::reverse_iterator end = data.rbegin() + elems;
// Iterate from the end. Move elements from the end to the new end (and so elements to insert will have some place).
for(vector<T>::reverse_iterator new_end = data.rbegin(); end != data.rend() && elems > 0; ++new_end,++end)
{
// If the current element is the one we search, insert second first. (We iterate from the end).
if(*end == first)
{
*new_end = second;
++new_end;
--elems;
}
// Copy the data to the end.
*new_end = *end;
}
This algorithm may be buggy, but the idea is to copy each element only once:
First, count how many elements need to be inserted.
Second, go through the data from the end and move each element to its new position.

This is what I probably would do:
typedef ::std::vector<int> MyList;
typedef MyList::iterator MyListIter;
MyList data;
// ... fill data ...
const int searchValue = 2;
const int addValue = 3;
// Find first occurrence of searched value
MyListIter iter = ::std::find(data.begin(), data.end(), searchValue);
while(iter != data.end())
{
// We want to add our value after searched one
++iter;
// Insert value and return iterator pointing to the inserted position
// (original iterator is invalid now).
iter = data.insert(iter, addValue);
// This is needed only if we want to be sure that our value won't be used
// - for example, if searchValue == addValue is true, the code would loop
// forever.
++iter;
// Search for next value.
iter = ::std::find(iter, data.end(), searchValue);
}
As you can see, I couldn't avoid the increment you mentioned, but I don't think that is a bad thing. I would put this code into a separate function (probably in some kind of "core/utils" module) and, of course, implement it as a template, so I would write it only once. Having to worry about the increment only once is IMHO acceptable. Very acceptable.
template <class ValueType>
void insertAfter(::std::vector<ValueType> &io_data,
const ValueType &i_searchValue,
const ValueType &i_insertAfterValue);
or even better (IMHO)
template <class ListType, class ValueType>
void insertAfter(ListType &io_data,
const ValueType &i_searchValue,
const ValueType &i_insertAfterValue);
EDIT:
Well, I would solve the problem in a slightly different way: first count the number of occurrences of the searched value (preferably stored in some kind of cache which can be kept and reused), so I could prepare the array in advance (only one allocation) and use memcpy to move the original values (for types like int only, of course), or memmove (if the vector's allocated size is already sufficient).

In place, O(1) additional memory and O(n) time (Live at Coliru):
template <typename T, typename A>
void do_thing(std::vector<T, A>& vec, T target, T inserted) {
using std::swap;
typedef typename std::vector<T, A>::size_type size_t;
const size_t occurrences = std::count(vec.begin(), vec.end(), target);
if (occurrences == 0) return;
const size_t original_size = vec.size();
vec.resize(original_size + occurrences, inserted);
for(size_t i = original_size - 1, end = i + occurrences; i > 0; --i, --end) {
if (vec[i] == target) {
--end;
}
swap(vec[i], vec[end]);
}
}

C++ Standard Library approach to removing one of a pair of items in a list that satisfy a criterion

Imagine you have an std::list with a set of values in it. For demonstration's sake, we'll say it's just std::list<int>, but in my case they're actually 2D points. Anyway, I want to remove one of a pair of ints (or points) which satisfy some sort of distance criterion. My question is how to approach this as an iteration that doesn't do more than O(N^2) operations.
Example
Source is a list of ints containing:
{ 16, 2, 5, 10, 15, 1, 20 }
If I gave this a distance criterion of 1 (i.e. no item in the list should be within 1 of any other), I'd like to produce the following output:
{ 16, 2, 5, 10, 20 } if I iterated forward or
{ 20, 1, 15, 10, 5 } if I iterated backward
I feel that there must be some awesome way to do this, but I'm stuck with this double loop of iterators and trying to erase items while iterating through the list.

Make a map of "regions", basically a std::map<coordinates/len, std::vector<point>>.
Add each point to its region and to each of the 8 neighboring regions, which is O(N*logN). Run the "naive" algorithm on each of these smaller lists (technically O(N^2), unless there's a maximum density, in which case it becomes O(N*density)). Finally, iterate through each point of your original list, and if it has been removed from any of the mini-lists it was put in, remove it from the list. This last step is O(N).
With no limit on density, this is O(N^2), and slow. But this gets faster and faster the more spread out the points are. If the points are somewhat evenly distributed in a known boundary, you can switch to a two dimensional array, making this significantly faster, and if there's a constant limit to the density, that technically makes this an O(N) algorithm.
That is how you sort a list of two variables by the way. The grid/map/2dvector thing.
[EDIT] You mentioned you were having trouble with the "naive" method too, so here's that:
template<class iterator, class criterion>
iterator RemoveCriterion(iterator begin, iterator end, criterion criter) {
iterator actend = end;
for(iterator L=begin; L != actend; ++L) {
iterator R(L);
for(++R; R != actend;) {
if (criter(*L, *R)) {
iterator N(R);
std::rotate(R, ++N, actend);
--actend;
} else
++R;
}
}
return actend;
}
This should work on linked lists, vectors, and similar containers, and works in reverse. Unfortunately, it's kinda slow due to not taking into account the properties of linked lists. It's possible to make much faster versions that only work on linked lists in a specific direction. Note that the return value is important, like with the other mutating algorithms. It can only alter contents of the container, not the container itself, so you'll have to erase all elements after the return value when it finishes.
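To illustrate the final erase step, here is a sketch that repeats the function (with balanced parentheses) and applies it to a std::list<int>. The helper names within_one and remove_close are hypothetical, and the criterion assumes plain ints with a distance of 1.

```cpp
#include <algorithm>
#include <cstdlib>
#include <list>

template <class Iterator, class Criterion>
Iterator RemoveCriterion(Iterator begin, Iterator end, Criterion criter) {
    Iterator actend = end;
    for (Iterator L = begin; L != actend; ++L) {
        Iterator R(L);
        for (++R; R != actend;) {
            if (criter(*L, *R)) {
                Iterator N(R);
                std::rotate(R, ++N, actend);  // move *R behind the kept range
                --actend;
            } else {
                ++R;
            }
        }
    }
    return actend;
}

// Hypothetical criterion: two ints are too close if they differ by at most 1.
bool within_one(int a, int b) { return std::abs(a - b) <= 1; }

// Hypothetical helper: the algorithm only reorders values, so the tail after
// the returned iterator still has to be erased from the container.
void remove_close(std::list<int>& data) {
    data.erase(RemoveCriterion(data.begin(), data.end(), within_one), data.end());
}
```

For the list { 16, 2, 5, 10, 15, 1, 20 } from the question, remove_close leaves { 16, 2, 5, 10, 20 }.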

Cubbi had the best answer, though he deleted it for some reason:
Sounds like it's a sorted list, in which case std::unique will do the job of removing the second element of each pair:
#include <list>
#include <algorithm>
#include <cstdlib>
#include <iostream>
#include <iterator>
int main()
{
    std::list<int> data = {1,2,5,10,15,16,20};
    // Print each element unless it is within 1 of the previously kept one.
    std::unique_copy(data.begin(), data.end(),
                     std::ostream_iterator<int>(std::cout, " "),
                     [](int n, int m){return std::abs(n-m)<=1;});
    std::cout << '\n';
}
demo: https://ideone.com/OnGxk
That trivially extends to other types -- either by changing int to something else, or by defining a template:
template<typename T> void remove_close(const std::list<T> &data, int distance)
{
    // Prints the filtered sequence; data itself is left unchanged.
    std::unique_copy(data.begin(), data.end(),
                     std::ostream_iterator<T>(std::cout, " "),
                     [distance](T n, T m){return abs(n-m)<=distance;});
}
This works for any type that defines operator- and abs, so that a distance between two objects can be computed.

As a mathematician, I am pretty sure there is no 'awesome' way to approach this problem for an unsorted list. It seems to me that it is a logical necessity to check the criterion for each element against all previously selected elements in order to determine whether insertion is viable or not. There may be a number of ways to optimize this, depending on the size of the list and the criterion.
Perhaps you could maintain a bitset based on the criterion. E.g. suppose abs(n-m) <= 1 is the criterion. Suppose the first element has the value 5. It is carried over into the new list, so set bitset[5] to 1. Then, when you encounter an element with the value 6, say, you need only test
!( bitset[5] | bitset[6] | bitset[7])
This would ensure that no element of the resulting list is within 1 of another. However, this idea may be difficult to extend to more complicated (non-discrete) criteria.
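A rough sketch of the bitset idea for a distance of 1. It assumes every value is a non-negative int no larger than a known max_value; both assumptions, and the name filter_close, are mine rather than part of the question.

```cpp
#include <cstddef>
#include <vector>

// Keeps a value only if neither it nor its two neighbours were kept before.
// Assumes 0 <= v <= max_value for every v in input.
std::vector<int> filter_close(const std::vector<int>& input, int max_value) {
    // taken[v + 1] is set once v has been kept. The extra slot on each side
    // lets us test v - 1 and v + 1 without bounds checks.
    std::vector<bool> taken(static_cast<std::size_t>(max_value) + 3, false);
    std::vector<int> kept;
    for (int v : input) {
        const std::size_t slot = static_cast<std::size_t>(v) + 1;
        if (!(taken[slot - 1] || taken[slot] || taken[slot + 1])) {
            taken[slot] = true;
            kept.push_back(v);
        }
    }
    return kept;
}
```

This makes a single pass over the input, at the price of memory proportional to the value range rather than to the number of elements.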

What about:
struct IsNeighbour : public std::binary_function<int,int,bool>
{
IsNeighbour(int dist)
: distance(dist) {}
bool operator()(int a, int b) const
{ return abs(a-b) <= distance; }
int distance;
};
std::list<int>::iterator iter = lst.begin();
while(iter != lst.end())
{
iter = std::adjacent_find(iter, lst.end(), IsNeighbour(some_distance));
if(iter != lst.end())
iter = lst.erase(iter);
}
This should run in O(n). It searches for the first pair of neighbours (which are at most some_distance away from each other) and removes the first of this pair. This is repeated (starting from the found item and not from the beginning, of course) until no more pairs are found.
EDIT: Oh sorry, you said any other and not just its next element. In this case the above algorithm only works for a sorted list. So you should sort it first, if necessary.
You can also use std::unique instead of this custom loop above:
lst.erase(std::unique(lst.begin(), lst.end(), IsNeighbour(some_distance)), lst.end());
but this removes the second item of each equal pair, and not the first, so you may have to reverse the iteration direction if this matters.
For 2D points instead of ints (1D points) it is not that easy, as you cannot just sort them by their euclidean distance. So if your real problem is to do it on 2D points, you might rephrase the question to point that out more clearly and remove the oversimplified int example.

I think this will work, as long as you don't mind making copies of the data, but if it's just a pair of integers/floats, that should be pretty low-cost. You're making n^2 comparisons, but you're using <algorithm> and can declare the input vector const.
// Calculates the distance between two points and returns true if that
// distance is under its threshold.
bool isTooClose(const Point& lhs, const Point& rhs, int threshold = 1);
// vec is the original vector, passed in; out is the output vector, returned
// however you like. (The function name filterClose is only a placeholder.)
void filterClose(const vector<Point>& vec, vector<Point>& out) {
    for(auto b = vec.begin(), e = vec.end(); b != e; ++b) {
        const Point& candidate = *b;
        // bind1st cannot wrap a plain function that has a default argument,
        // so a lambda does the binding instead.
        if(find_if(out.begin(),
                   out.end(),
                   [&](const Point& p) { return isTooClose(candidate, p); }) == out.end())
        { // we didn't find anyone too close to us in the output vector. Let's add!
            out.push_back(candidate);
        }
    }
}

std::list<>.erase(remove_if(...)) using functors
http://en.wikipedia.org/wiki/Erase-remove_idiom
Update(added code):
struct IsNeighbour : public std::unary_function<int,bool>
{
IsNeighbour(int dist)
: m_distance(dist), m_old_value(0){}
bool operator()(int a)
{
bool result = abs(a-m_old_value) <= m_distance;
m_old_value = a;
return result;
}
int m_distance;
int m_old_value;
};
main function...
std::list<int> data = {1,2,5,10,15,16,20};
data.erase(std::remove_if(data.begin(), data.end(), IsNeighbour(1)), data.end());
