find the difference between two sets of pointers to the same object

How can I find the difference between two sets of pointers to the same object?
Is there an efficient way that avoids iterating through all the objects of both sets?
I have two sets of this type:
std::set<Object*>
Two objects count as the same object if their private member (name) is equal.

STL's algorithm library is awesome, extensible, and underused.
This will give you the set difference as a vector (I suppose you could convert that to a set, but there's no need, at least for what you asked, and a vector is faster since the sets are already sorted).
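#include <algorithm>
#include <iterator>
#include <set>
#include <vector>
template<typename T>
std::vector<T> set_diff(std::set<T> const &a, std::set<T> const &b) {
std::vector<T> v;
// back_inserter appends to v; writing through v.begin() of an empty vector is undefined behavior
std::set_difference(a.begin(), a.end(), b.begin(), b.end(), std::back_inserter(v));
return v;
}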
Optionally, put after the constructor
v.reserve(a.size() + b.size());
and before the return (C++11)
v.shrink_to_fit();
Note: This yields the items in a that are not in b. To find all items in one of the two but not the other, use std::set_symmetric_difference instead.
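The question treats two objects as equal when their names match, whereas a std::set<Object*> with the default comparator orders and compares raw pointer values. Below is a minimal sketch of one way to account for that. It assumes a hypothetical Object class with a name() accessor (neither is shown in the question), orders both sets with a name-based comparator, and passes the same comparator to std::set_difference.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the question's Object; only the name matters here.
class Object {
public:
    explicit Object(std::string name) : name_(std::move(name)) {}
    const std::string& name() const { return name_; }
private:
    std::string name_;
};

// Orders pointers by the name of the pointed-to object, not by address.
struct ByName {
    bool operator()(const Object* lhs, const Object* rhs) const {
        return lhs->name() < rhs->name();
    }
};

using ObjectSet = std::set<Object*, ByName>;

// Elements of a whose names do not occur in b.
std::vector<Object*> diff_by_name(const ObjectSet& a, const ObjectSet& b) {
    std::vector<Object*> out;
    std::set_difference(a.begin(), a.end(), b.begin(), b.end(),
                        std::back_inserter(out), ByName{});
    return out;
}

// Usage: two distinct objects named "x" count as the same element.
std::vector<std::string> diff_by_name_example() {
    Object a1("x"), a2("y"), b1("x");
    ObjectSet a{&a1, &a2};
    ObjectSet b{&b1};
    std::vector<std::string> names;
    for (const Object* o : diff_by_name(a, b)) names.push_back(o->name());
    assert(names.size() == 1);  // only "y" is missing from b
    return names;
}
```

The comparator passed to the algorithm has to be the same ordering the sets were built with; otherwise the input ranges are not sorted from the algorithm's point of view.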

I think by difference you mean the pointer elements that appear in only one of the sets. The most efficient way is to iterate over the two sets in lockstep, which costs only O(n+m) time, where n and m are the sizes of the two sets; in the general case that is the lower bound for the problem.
Luckily, std::set is typically implemented as a balanced binary search tree, so we can iterate over all the elements in sorted order in linear time, and O(n+m) can be achieved.
template<typename T>
std::vector<T> set_diff(std::set<T> const &a, std::set<T> const &b) {
std::vector<T> v;
auto ita = a.begin();
auto itb = b.begin();
while (ita != a.end() && itb != b.end()) {
if (*ita == *itb) {
++ita, ++itb;
} else if (*ita < *itb) {
v.push_back(*ita);
++ita;
} else {
v.push_back(*itb);
++itb;
}
}
for (; ita != a.end(); v.push_back(*ita), ++ita);
for (; itb != b.end(); v.push_back(*itb), ++itb);
return v;
}
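The loop above computes the symmetric difference, which the standard library also provides as std::set_symmetric_difference. A sketch for comparison follows; the function name sym_diff is made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <vector>

// Elements that appear in exactly one of the two sets, in sorted order.
template <typename T>
std::vector<T> sym_diff(const std::set<T>& a, const std::set<T>& b) {
    std::vector<T> out;
    std::set_symmetric_difference(a.begin(), a.end(), b.begin(), b.end(),
                                  std::back_inserter(out));
    return out;
}

void sym_diff_example() {
    std::set<int> a{1, 2, 3};
    std::set<int> b{2, 3, 4};
    std::vector<int> expected{1, 4};  // 1 is only in a, 4 is only in b
    assert(sym_diff(a, b) == expected);
}
```

As with the hand-written loop, a std::set<Object*> is compared by pointer value unless the set is declared with a custom comparator.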

Related

Erasing many vector elements while going through it with 'auto'

Let's say that I have vector of pairs, where each pair corresponds to indexes (row and column) of certain matrix I am working on
using namespace std;
vector<pair<int, int>> vec;
I wanted to, using auto, go through the whole vector and delete at once all the pairs that fulfill certain conditions, for example something like
for (auto& x : vec) {
if (x.first == x.second) {
vec.erase(x);
}
}
But it doesn't work: I suppose vec.erase() expects an iterator as its argument, while x is a pair that is an element of vec, not an iterator. I tried to modify it in a few ways, but I am not sure how iterating over container elements with auto works exactly or how I can fix this.
Can I easily modify the code above to make it work and erase multiple elements of the vector while going through it with auto? Or should I change my approach?
For now it's just a vector of pairs, but it will get much worse later on, so I would like to use auto for simplicity.

vector::erase() invalidates iterators at and after the point of erasure, including the one your range-based for loop is using. Use std::remove_if():
vec.erase(
std::remove_if(
vec.begin(),
vec.end(),
[](const pair<int,int> &xx) { return xx.first == xx.second; }
), vec.end()
);
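std::remove_if() moves the elements you want to keep to the front of the vector and returns an iterator to the new logical end; the elements from there to vec.end() are left in an unspecified state, so you can safely erase them in one call.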
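To make the two steps visible, here is a small sketch that separates them. The helper name erase_diagonal is hypothetical; the vector of pairs follows the question.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Removes every pair whose row and column index are equal.
void erase_diagonal(std::vector<std::pair<int, int>>& vec) {
    // Step 1: shift the pairs to keep to the front, preserving their order.
    // new_end marks the first position past the kept pairs; vec.size() is unchanged.
    auto new_end = std::remove_if(
        vec.begin(), vec.end(),
        [](const std::pair<int, int>& p) { return p.first == p.second; });
    // Step 2: drop the leftover tail in one call.
    vec.erase(new_end, vec.end());
}

void erase_diagonal_example() {
    std::vector<std::pair<int, int>> vec{{0, 0}, {0, 1}, {2, 2}, {3, 1}};
    erase_diagonal(vec);
    assert(vec.size() == 2);
    assert(vec[0] == std::make_pair(0, 1));
    assert(vec[1] == std::make_pair(3, 1));
}
```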

I would prefer something like this:
auto iter = vec.begin();
while (iter != vec.end()) {
const pair<int, int>& current = *iter;
if (current.first == current.second) {
// erase returns the iterator following the removed element
iter = vec.erase(iter);
} else {
++iter;
}
}

faster erase-remove idiom when I don't care about order and don't have duplicates?

I have a vector of objects and want to delete by value. However the value only occurs once if at all, and I don't care about sorting.
Obviously, if such delete-by-values were extremely common, and/or the data set quite big, a vector wouldn't be the best data structure. But let's say I've determined that not to be the case.
To be clear, if my code were C, I'd be happy with the following:
void delete_by_value( int* const piArray, int& n, int iValue ) {
for ( int i = 0; i < n; i++ ) {
if ( piArray[ i ] == iValue ) {
piArray[ i ] = piArray[ --n ];
return;
}
}
}
It seems that the "modern idiom" approach using std::algos and container methods would be:
v.erase(std::remove(v.begin(), v.end(), iValue), v.end());
But that should be far slower since for a random existent element, it's n/2 moves and n compares. My version is 1 move and n/2 compares.
Surely there's a better way to do this in "the modern idiom" than erase-remove-idiom? And if not why not?

Use std::find to replace the loop. Take the replacement value from the predecessor of the end iterator, and also use that iterator to erase that element. Because this iterator points to the last element, the erase is cheap. Bonus: a bool return value for checking success, and templating over the element type instead of int.
template<typename T>
bool delete_by_value(std::vector<T> &v, T const &del) {
auto final = v.end();
auto found = std::find(v.begin(), final, del);
if(found == final) return false;
*found = *--final;
v.erase(final);
return true;
}

Surely there's a better way to do this in "the modern idiom" than erase-remove-idiom?
There isn't a ready-made function for every niche use case in the standard library. Unstable remove is one of the functions that is not provided, although it was proposed a while back (p0041r0). Likewise, there are no special versions of algorithms for vectors that contain no duplicates.
So you'll need to implement the algorithm yourself if you want the optimal one. There is std::find for the linear search. After that, you only need to assign from the last element and finally pop it off.

Most implementations of std::vector::resize will not reallocate if you make the size of the vector smaller. So, the following will probably have similar performance to the C example.
void find_and_delete(std::vector<int>& v, int value) {
auto it = std::find(v.begin(), v.end(), value);
if (it != v.end()) {
*it = v.back();
v.resize(v.size() - 1);
}
}

C++ way would be mostly identical with std::vector:
template <typename T>
void delete_by_value(std::vector<T>& v, const T& value) {
auto it = std::find(v.begin(), v.end(), value);
if (it != v.end()) {
*it = std::move(v.back());
v.pop_back();
}
}

Want to delete a vector from another vector in C++ [duplicate]

I want to clear an element from a vector using the erase method. The problem is that the element is not guaranteed to occur only once in the vector. It may be present multiple times and I need to clear all occurrences. My code is something like this:
void erase(std::vector<int>& myNumbers_in, int number_in)
{
std::vector<int>::iterator iter = myNumbers_in.begin();
std::vector<int>::iterator endIter = myNumbers_in.end();
for(; iter != endIter; ++iter)
{
if(*iter == number_in)
{
myNumbers_in.erase(iter);
}
}
}
int main(int argc, char* argv[])
{
std::vector<int> myNmbers;
for(int i = 0; i < 2; ++i)
{
myNmbers.push_back(i);
myNmbers.push_back(i);
}
erase(myNmbers, 1);
return 0;
}
This code obviously crashes because I am changing the end of the vector while iterating through it. What is the best way to achieve this? I.e. is there any way to do this without iterating through the vector multiple times or creating one more copy of the vector?

Use the remove/erase idiom:
std::vector<int>& vec = myNumbers; // use shorter name
vec.erase(std::remove(vec.begin(), vec.end(), number_in), vec.end());
What happens is that remove compacts the elements that differ from the value to be removed (number_in) at the beginning of the vector and returns an iterator to the first element after that range. Then erase removes the elements from that iterator to the end (whose values are unspecified).
Edit: While updating a dead link I discovered that starting in C++20 there are freestanding std::erase and std::erase_if functions that work on containers and simplify things considerably.

Calling erase will invalidate iterators, you could use:
void erase(std::vector<int>& myNumbers_in, int number_in)
{
std::vector<int>::iterator iter = myNumbers_in.begin();
while (iter != myNumbers_in.end())
{
if (*iter == number_in)
{
iter = myNumbers_in.erase(iter);
}
else
{
++iter;
}
}
}
Or you could use std::remove_if together with a functor and std::vector::erase:
struct Eraser
{
Eraser(int number_in) : number_in(number_in) {}
int number_in;
bool operator()(int i) const
{
return i == number_in;
}
};
std::vector<int> myNumbers;
myNumbers.erase(std::remove_if(myNumbers.begin(), myNumbers.end(), Eraser(number_in)), myNumbers.end());
Instead of writing your own functor in this case you could use std::remove:
std::vector<int> myNumbers;
myNumbers.erase(std::remove(myNumbers.begin(), myNumbers.end(), number_in), myNumbers.end());
In C++11 you could use a lambda instead of a functor:
std::vector<int> myNumbers;
myNumbers.erase(std::remove_if(myNumbers.begin(), myNumbers.end(), [number_in](int number){ return number == number_in; }), myNumbers.end());
Implementations of the Library Fundamentals TS offer std::experimental::erase and std::experimental::erase_if (in <experimental/vector>); in C++20 these are (finally) standardized as std::erase and std::erase_if (note: in Visual Studio 2019 you'll need to change your C++ language version to the latest experimental version for support):
std::vector<int> myNumbers;
std::erase_if(myNumbers, Eraser(number_in)); // or use lambda
or:
std::vector<int> myNumbers;
std::erase(myNumbers, number_in);

You can iterate using index access. To avoid O(n^2) complexity, use two indices: i is the index currently being tested, and j is the index at which to store the next kept item; at the end of the loop, j is the new size of the vector.
Code:
void erase(std::vector<int>& v, int num)
{
size_t j = 0;
for (size_t i = 0; i < v.size(); ++i) {
if (v[i] != num) v[j++] = v[i];
}
// trim vector to new size
v.resize(j);
}
This way no iterators are invalidated, the complexity is O(n), the code is concise, and you don't need to write any helper classes, although in some cases helper classes make the code more flexible.
This code does not use the erase method, but it solves your task.
Using pure STL you can do this in the following way (this is similar to Motti's answer):
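#include <algorithm>
#include <vector>
void erase(std::vector<int>& v, int num) {
// std::remove shifts the kept elements forward and returns the new logical end
std::vector<int>::iterator it = std::remove(v.begin(), v.end(), num);
v.erase(it, v.end());
}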

Depending on why you are doing this, using a std::set might be a better idea than std::vector.
It allows each element to occur only once. If you add it multiple times, there will only be one instance to erase anyway. This will make the erase operation trivial.
The erase operation will also have lower time complexity than on the vector, however, adding elements is slower on the set so it might not be much of an advantage.
This of course won't work if you are interested in how many times an element has been added to your vector or the order the elements were added.
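A short sketch of that trade-off, with made-up values and a hypothetical helper name: inserting a duplicate into a std::set is a no-op, and erase by key removes the value in a single logarithmic-time call.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Inserts the value twice and erases it once; returns how many elements erase removed.
std::size_t insert_twice_erase_once(std::set<int>& numbers, int value) {
    numbers.insert(value);
    numbers.insert(value);        // ignored: the set already holds value
    return numbers.erase(value);  // 1 if the value was present, otherwise 0
}

void set_erase_example() {
    std::set<int> numbers{0, 2};
    assert(insert_twice_erase_once(numbers, 1) == 1);
    assert(numbers.count(1) == 0);  // no second copy is left behind
    assert(numbers.size() == 2);
}
```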

Since C++20 there are std::erase and std::erase_if, which combine the two steps of the remove-erase idiom.
std::vector<int> nums;
...
std::erase(nums, targetNumber);
or
std::vector<int> nums;
...
std::erase_if(nums, [](int x) { return x % 2 == 0; });

If you change your code as follows, you can do stable deletion.
void atest(vector<int>& container,int number_in){
for (auto it = container.begin(); it != container.end();) {
if (*it == number_in) {
it = container.erase(it);
} else {
++it;
}
}
}
However, a method such as the following can also be used.
void btest(vector<int>& container,int number_in){
container.erase(std::remove(container.begin(), container.end(), number_in),container.end());
}
If we must preserve our sequence’s order (say, if we’re keeping it sorted by some interesting property), then we can use one of the above. But if the sequence is just a bag of values whose order we don’t care about at all, then we might consider moving single elements from the end of the sequence to fill each new gap as it’s created:
void ctest(vector<int>& container,int number_in){
// an index stays usable after pop_back, unlike an iterator to the removed last element
for (std::size_t i = 0; i < container.size(); ) {
if (container[i] == number_in) {
container[i] = std::move(container.back());
container.pop_back();
} else {
++i;
}
}
}
Benchmark results (charts for Clang 15.0 and GCC 12.2 not included).

elegant way to remove all elements of a vector that are contained in another vector?

While looking over some code I found a loopy and algorithmically slow implementation of std::set_difference:
for(int i = 0; i < a.size(); i++)
{
iter = std::find(b.begin(),b.end(),a[i]);
if(iter != b.end())
{
b.erase(iter);
}
}
It can easily be replaced with sort (the vectors are not sorted) + set_difference, but that requires allocating new memory (see my recent question, Can output of set difference be stored in first input?, for why it can't be done in place).
So my solution would be something like:
sort(a.begin(), a.end());
for(size_t i = 0; i < b.size(); )
{
if (binary_search(a.begin(), a.end(), b[i]))
{
swap(b[i], b[b.size()-1]); //remove current element by swapping with last
b.pop_back(); // and removing new last by shrinking
// do not advance i: the element swapped into b[i] has not been checked yet
}
else
{
++i;
}
}
Can it be done more elegantly?
Elegant is subjective, so within the scope of this question it is defined as clearer code (ideally something from the STL algorithms, though I think that can't be done) with no memory allocation and no increase in algorithmic complexity.

This one does it in O(N+M), assuming both arrays are sorted.
auto ib = std::begin(two);
auto iter = std::remove_if (
std::begin(one), std::end(one),
[&ib](int x) -> bool {
while (ib != std::end(two) && *ib < x) ++ib;
return (ib != std::end(two) && *ib == x);
});
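For completeness, here is a self-contained sketch of the same idea with the final erase step included. The function name remove_sorted is invented for the example, and both inputs are assumed to be sorted in ascending order. The stateful predicate relies on std::remove_if visiting the elements front to back, which common implementations do for this kind of range.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Removes from `one` every value that also occurs in `two`.
// Precondition (assumed, not checked): both vectors are sorted ascending.
void remove_sorted(std::vector<int>& one, const std::vector<int>& two) {
    auto ib = two.begin();
    auto new_end = std::remove_if(one.begin(), one.end(), [&](int x) {
        // Advance through `two` until reaching a value that is not less than x.
        while (ib != two.end() && *ib < x) ++ib;
        return ib != two.end() && *ib == x;
    });
    one.erase(new_end, one.end());
}

void remove_sorted_example() {
    std::vector<int> one{1, 2, 3, 5, 8};
    std::vector<int> two{2, 3, 4, 8};
    remove_sorted(one, two);
    std::vector<int> expected{1, 5};
    assert(one == expected);
}
```

Because the cursor into `two` only ever moves forward, the whole pass costs O(N+M) comparisons.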

Sort b so you can binary search it in order to reduce time complexity. Then use the erase-remove idiom in order to throw away all elements from a that are contained in b:
sort( begin(b), end(b) );
a.erase( remove_if( begin(a),end(a),
[&](auto x){return binary_search(begin(b),end(b),x);}), end(a) );
Of course, you can still sacrifice time complexity for simplicity and reduce your code by removing the sort() and replacing binary_search() by find():
a.erase( remove_if( begin(a),end(a),
[&](auto x){return find(begin(b),end(b),x)!=end(b);}), end(a) );
This is a matter of taste. In both cases you don't need heap allocations. By the way, I'm using lambda auto parameters which are C++14. Some compilers already implement that feature such as clang. If you don't have such a compiler, but only C++11 then replace auto by the element type of the container.
By the way, this code does not mention any types! You can write a template function so it works for all kind of types. The first variant requires random access iteration of b while the second piece of code does not require that.
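A sketch of what such a template function might look like for the sorted variant. The name erase_all_in is hypothetical, the element type is spelled out so the code also compiles as C++11, and b is sorted in place to avoid an extra allocation.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Erases from `a` every element that also occurs in `b`.
// Sorts `b` in place so that no extra memory is allocated.
template <typename T>
void erase_all_in(std::vector<T>& a, std::vector<T>& b) {
    std::sort(b.begin(), b.end());
    a.erase(std::remove_if(a.begin(), a.end(),
                           [&b](const T& x) {
                               return std::binary_search(b.begin(), b.end(), x);
                           }),
            a.end());
}

void erase_all_in_example() {
    std::vector<int> a{3, 2, 7, 5, 11, 13};
    std::vector<int> b{2, 3, 13, 5};
    erase_all_in(a, b);
    std::vector<int> expected{7, 11};  // relative order of the kept elements is preserved
    assert(a == expected);
}
```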

One solution that comes to mind is combining remove_if and binary_search. It's effectively the same as your manual looping solution but might be a bit more "elegant" as it uses more STL features.
sort(begin(b), end(b));
auto iter = remove_if(begin(a), end(a),
[&b](auto x) { // capture b by reference so the lambda can search it
return binary_search(begin(b), end(b), x);
});
// Now [begin(a), iter) defines a new range, and you can erase them however
// you see fit, based on the type of a.

The current code is quite clear, in that it should be obvious to any programmer what's going on.
The current performance is O(a.size() * b.size()), which may be pretty bad depending upon the actual sizes.
A more concise and STL-like way to describe it is to use remove_if with a predicate that tells you whether a value is in a.
b.erase(std::remove_if(b.begin(), b.end(), [&a](const auto&x) {
return std::find(a.begin(), a.end(), x) != a.end();
}), b.end());
(Not tested, so I might have made a syntax error.) I used a generic lambda, which needs C++14; you can write a functor instead if your compiler does not support lambdas.
Note that the original code removes just one instance of a value in b that's also in a. My solution will remove all instances of such a value from b.
Note that the find operation happens again and again, so it's probably better to do that on the smaller vector for better locality of reference.

After thinking for a while I came up with this
(note: by answering my own question I am not claiming this is superior to the answers offered):
vector<int64_t> a{3,2,7,5,11,13}, b{2,3,13,5};
set<int64_t> bs(b.begin(), b.end());
for (const auto& num: bs)
cout << num << " ";
cout << endl;
for (const auto& num: a)
bs.erase(num);
vector<int64_t> result(bs.begin(), bs.end());
for (const auto& num: result)
cout << num << " ";

What could be reason it crashes when I use vector::erase?

I am trying to do some operations on a vector, and I call erase on it only in some cases.
Here is my code:
while(myQueue.size() != 1)
{
vector<pair<int,int>>::iterator itr = myQueue.begin();
while(itr != myQueue.end())
{
if(itr->first%2 != 0)
myQueue.erase(itr);
else
{
itr->second = itr->second/2;
itr++;
}
}
}
I get a crash in the 2nd iteration, with the message: vector iterator incompatible.
What could be the reason for the crash?

If erase() is called the iterator is invalidated and that iterator is then accessed on the next iteration of the loop. std::vector::erase() returns the next iterator after the erased iterator:
itr = myQueue.erase(itr);

Given an iterator range [b, e) of a vector, where b is the beginning and e is one past the end, an erase operation on an iterator i somewhere in the range invalidates all iterators from i up to e. That is why you need to be very careful when calling erase. The erase member does return a new iterator which you can use for subsequent operations, and you ought to use it:
itr = myQueue.erase( itr );
Another way would be to swap the element at i with the last element and then erase the last. This is more efficient because no elements beyond i have to be moved, at the cost of not preserving the order.
std::swap( *i, myQueue.back() );
myQueue.pop_back();
Also, from the looks of it, why are you using vector? If you need a queue you might as well use std::queue.

That is undefined behavior. In particular, once you erase an iterator, it becomes invalid and you can no longer use it for anything. The idiomatic way of writing the loop would be something like:
for ( auto it = v.begin(); it != v.end(); ) {
if ( it->first % 2 != 0 )
it = v.erase(it);
else {
it->second /= 2;
++it;
}
}
But then again, it will be more efficient and idiomatic not to roll your own loop and rather use the algorithms:
v.erase( std::remove_if( v.begin(),
v.end(),
[]( std::pair<int,int> const & p ) {
return p.first % 2 != 0;
}),
v.end() );
std::transform( v.begin(), v.end(), v.begin(),
[]( std::pair<int,int> const & p ) {
return std::make_pair(p.first, p.second/2);
} );
The advantage of this approach is that there is a lesser number of copies of the elements while erasing (each valid element left in the range will have been copied no more than once), and it is harder to get it wrong (i.e. misuse an invalidated iterator...) The disadvantage is that there is no remove_if_and_transform so this is a two pass algorithm, which might be less efficient if there is a large number of elements.
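If the two passes matter, the filter and the transformation can be fused into one hand-written compaction loop. This is only a sketch, not a standard algorithm, and the name remove_odd_and_halve is invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Drops pairs with an odd first member and halves the second member of the rest,
// visiting each element once.
void remove_odd_and_halve(std::vector<std::pair<int, int>>& v) {
    std::size_t write = 0;  // next slot for a kept element
    for (std::size_t read = 0; read < v.size(); ++read) {
        if (v[read].first % 2 != 0) continue;  // odd key: skip, i.e. remove
        v[write] = std::make_pair(v[read].first, v[read].second / 2);
        ++write;
    }
    v.resize(write);  // trim the vector to the kept elements
}

void remove_odd_and_halve_example() {
    std::vector<std::pair<int, int>> v{{1, 10}, {2, 10}, {4, 7}, {3, 3}};
    remove_odd_and_halve(v);
    assert(v.size() == 2);
    assert(v[0] == std::make_pair(2, 5));
    assert(v[1] == std::make_pair(4, 3));
}
```

The trade-off is the one described above: a single pass, but the loop is hand-rolled again and so easier to get wrong than the two algorithm calls.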

Modifying a container while iterating over it is generally tricky.
Therefore, there is a specific C++ idiom for sequence containers (as opposed to associative ones): the erase-remove idiom.
It combines the remove_if algorithm with the range overload of the erase method:
myQueue.erase(
std::remove_if(myQueue.begin(), myQueue.end(), /* predicate */),
myQueue.end());
where the predicate is expressed either as a typical functor object or using the new C++11 lambda syntax.
// Functor
struct OddKey {
bool operator()(std::pair<int, int> const& p) const {
return p.first % 2 != 0;
}
};
/* predicate */ = OddKey()
// Lambda
/* predicate */ = [](std::pair<int, int> const& p) { return p.first % 2 != 0; }
The lambda form is more concise but may be less self-documenting (no name) and is only available from C++11 on. Depending on your tastes and constraints, pick the one that suits you best.
It is possible to elevate your way of writing code: use Boost.Range.
typedef std::vector< std::pair<int, int> > PairVector;
void pass(PairVector& pv) {
// keep only the pairs with an even key
auto const filter = [](std::pair<int, int> const& p) {
return p.first % 2 == 0;
};
auto const transformer = [](std::pair<int, int> const& p) {
return std::make_pair(p.first, p.second / 2);
};
// build the result separately: appending to pv while iterating over it would invalidate the iterators
PairVector result;
boost::transform(pv | boost::adaptors::filtered( filter ),
std::back_inserter(result),
transformer);
pv.swap(result);
}
You can find transform and the filtered adaptor in the documentation, along with many others.
