Erasing multiple objects from a std::vector? - c++

Here is my issue, lets say I have a std::vector with ints in it.
let's say it has 50,90,40,90,80,60,80.
I know I need to remove the second, fifth and third elements. I don't necessarily always know the order of elements to remove, nor how many. The issue is by erasing an element, this changes the index of the other elements. Therefore, how could I erase these and compensate for the index change. (sorting then linearly erasing with an offset is not an option)
Thanks

I am offering several methods:
1. A fast method that does not retain the original order of the elements:
Assign the current last element of the vector to the element to erase, then erase the last element. This will avoid big moves and all indexes except the last will remain constant. If you start erasing from the back, all precomputed indexes will be correct.
void quickDelete( int idx )
{
vec[idx] = vec.back();
vec.pop_back();
}
I see this essentially is a hand-coded version of the erase-remove idiom pointed out by Klaim ...
2. A slower method that retains the original order of the elements:
Step 1: Mark all vector elements to be deleted, i.e. with a special value. This has O(|indexes to delete|).
Step 2: Erase all marked elements using v.erase( remove (v.begin(), v.end(), special_value), v.end() );. This has O(|vector v|).
The total run time is thus O(|vector v|), assuming the index list is shorter than the vector.
3. Another slower method that retains the original order of the elements:
Use a predicate and remove if as described in https://stackoverflow.com/a/3487742/280314 . To make this efficient and respecting the requirement of
not "sorting then linearly erasing with an offset", my idea is to implement the predicate using a hash table and adjust the indexes stored in the hash table as the deletion proceeds on returning true, as Klaim suggested.

Using a predicate and the algorithm remove_if you can achieve what you want : see http://www.cplusplus.com/reference/algorithm/remove_if/
Don't forget to erase the item (see remove-erase idiom).
Your predicate will simply hold the idx of each value to remove and decrease all indexes it keeps each time it returns true.
That said if you can afford just removing each object using the remove-erase idiom, just make your life simple by doing it.

Erase the items backwards. In other words erase the highest index first, then next highest etc. You won't invalidate any previous iterators or indexes so you can just use the obvious approach of multiple erase calls.

I would move the elements which you don't want to erase to a temporary vector and then replace the original vector with this.

While this answer by Peter G. in variant one (the swap-and-pop technique) is the fastest when you do not need to preserve the order, here is the unmentioned alternative which maintains the order.
With C++17 and C++20 the removal of multiple elements from a vector is possible with standard algorithms. The run time is O(N * Log(N)) due to std::stable_partition. There are no external helper arrays, no excessive copying, everything is done inplace. Code is a "one-liner":
template <class T>
inline void erase_selected(std::vector<T>& v, const std::vector<int>& selection)
{
v.resize(std::distance(
v.begin(),
std::stable_partition(v.begin(), v.end(),
[&selection, &v](const T& item) {
return !std::binary_search(
selection.begin(),
selection.end(),
static_cast<int>(static_cast<const T*>(&item) - &v[0]));
})));
}
The code above assumes that selection vector is sorted (if it is not the case, std::sort over it does the job, obviously).
To break this down, let us declare a number of temporaries:
// We need an explicit item index of an element
// to see if it should be in the output or not
int itemIndex = 0;
// The checker lambda returns `true` if the element is in `selection`
auto filter = [&itemIndex, &sorted_sel](const T& item) {
return !std::binary_search(
selection.begin(),
selection.end(),
itemIndex++);
};
This checker lambda is then fed to std::stable_partition algorithm which is guaranteed to call this lambda only once for each element in the original (unpermuted !) array v.
auto end_of_selected = std::stable_partition(
v.begin(),
v.end(),
filter);
The end_of_selected iterator points right after the last element which should remain in the output array, so we now can resize v down. To calculate the number of elements we use the std::distance to get size_t from two iterators.
v.resize(std::distance(v.begin(), end_of_selected));
This is different from the code at the top (it uses itemIndex to keep track of the array element). To get rid of the itemIndex, we capture the reference to source array v and use pointer arithmetic to calculate itemIndex internally.
Over the years (on this and other similar sites) multiple solutions have been proposed, but usually they employ multiple "raw loops" with conditions and some erase/insert/push_back calls. The idea behind stable_partition is explained beautifully in this talk by Sean Parent.
This link provides a similar solution (and it does not assume that selection is sorted - std::find_if instead of std::binary_search is used), but it also employs a helper (incremented) variable which disables the possibility to parallelize processing on larger arrays.
Starting from C++17, there is a new first argument to std::stable_partition (the ExecutionPolicy) which allows auto-parallelization of the algorithm, further reducing the run-time for big arrays. To make yourself believe this parallelization actually works, there is another talk by Hartmut Kaiser explaining the internals.

Would this work:
void DeleteAll(vector<int>& data, const vector<int>& deleteIndices)
{
vector<bool> markedElements(data.size(), false);
vector<int> tempBuffer;
tempBuffer.reserve(data.size()-deleteIndices.size());
for (vector<int>::const_iterator itDel = deleteIndices.begin(); itDel != deleteIndices.end(); itDel++)
markedElements[*itDel] = true;
for (size_t i=0; i<data.size(); i++)
{
if (!markedElements[i])
tempBuffer.push_back(data[i]);
}
data = tempBuffer;
}
It's an O(n) operation, no matter how many elements you delete. You could gain some efficiency by reordering the vector inline (but I think this way it's more readable).

This is non-trival because as you delete elements from the vector, the indexes change.
[0] hi
[1] you
[2] foo
>> delete [1]
[0] hi
[1] foo
If you keep a counter of times you delete an element and if you have a list of indexes you want to delete in sorted order then:
int counter = 0;
for (int k : IndexesToDelete) {
events.erase(events.begin()+ k + counter);
counter -= 1;
}

You can use this method, if the order of the remaining elements doesn't matter
#include <iostream>
#include <vector>
using namespace std;
int main()
{
vector< int> vec;
vec.push_back(1);
vec.push_back(-6);
vec.push_back(3);
vec.push_back(4);
vec.push_back(7);
vec.push_back(9);
vec.push_back(14);
vec.push_back(25);
cout << "The elements befor " << endl;
for(int i = 0; i < vec.size(); i++) cout << vec[i] <<endl;
vector< bool> toDeleted;
int YesOrNo = 0;
for(int i = 0; i<vec.size(); i++)
{
cout<<"You need to delete this element? "<<vec[i]<<", if yes enter 1 else enter 0"<<endl;
cin>>YesOrNo;
if(YesOrNo)
toDeleted.push_back(true);
else
toDeleted.push_back(false);
}
//Deleting, beginning from the last element to the first one
for(int i = toDeleted.size()-1; i>=0; i--)
{
if(toDeleted[i])
{
vec[i] = vec.back();
vec.pop_back();
}
}
cout << "The elements after" << endl;
for(int i = 0; i < vec.size(); i++) cout << vec[i] <<endl;
return 0;
}

Here's an elegant solution in case you want to preserve the indices, the idea is to replace the values you want to delete with a special value that is guaranteed not be used anywhere, and then at the very end, you perform the erase itself:
std::vector<int> vec = {1, 2, 3, 4, 5, 6, 7, 8, 9};
// marking 3 elements to be deleted
vec[2] = std::numeric_limits<int>::lowest();
vec[5] = std::numeric_limits<int>::lowest();
vec[3] = std::numeric_limits<int>::lowest();
// erase
vec.erase(std::remove(vec.begin(), vec.end(), std::numeric_limits<int>::lowest()), vec.end());
// print values => 1 2 5 7 8 9
for (const auto& value : vec) std::cout << ' ' << value;
std::cout << std::endl;
It's very quick if you delete a lot of elements because the deletion itself is happening only once. Items can also be deleted in any order that way.
If you use a a struct instead of an int, then you can still mark an element of that struct, for ex dead=true and then use remove_if instead of remove =>
struct MyObj
{
int x;
bool dead = false;
};
std::vector<MyObj> objs = {{1}, {2}, {3}, {4}, {5}, {6}, {7}, {8}, {9}};
objs[2].dead = true;
objs[5].dead = true;
objs[3].dead = true;
objs.erase(std::remove_if(objs.begin(), objs.end(), [](const MyObj& obj) { return obj.dead; }), objs.end());
// print values => 1 2 5 7 8 9
for (const auto& obj : objs) std::cout << ' ' << obj.x;
std::cout << std::endl;
This one is a bit slower, around 80% the speed of the remove.

Related

check if all item array equal in array [duplicate]

If I have a vector of values and want to check that they are all the same, what is the best way to do this in C++ efficiently? If I were programming in some other language like R one way my minds jumps to is to return only the unique elements of the container and then if the length of the unique elements is more than 1, I know all the elements cannot be the same. In C++ this can be done like this:
//build an int vector
std::sort(myvector.begin(), myvector.end());
std::vector<int>::iterator it;
//Use unique algorithm to get the unique values.
it = std::unique(myvector.begin(), myvector.end());
positions.resize(std::distance(myvector.begin(),it));
if (myvector.size() > 1) {
std::cout << "All elements are not the same!" << std::endl;
}
However reading on the internet and SO, I see other answers such using a set or the find_if algorithm. So what is the most efficient way of doing this and why? I imagine mine is not the best way since it involves sorting every element and then a resizing of the vector - but maybe I'm wrong.
You need not to use std::sort. It can be done in a simpler way:
if ( std::adjacent_find( myvector.begin(), myvector.end(), std::not_equal_to<>() ) == myvector.end() )
{
std::cout << "All elements are equal each other" << std::endl;
}
you can use std::equal
version 1:
//assuming v has at least 1 element
if ( std::equal(v.begin() + 1, v.end(), v.begin()) )
{
//all equal
}
This will compare each element with the previous one.
version 2:
//assuming v has at least 1 element
int e = v[0]; //preferably "const auto& e" instead
bool all_equal = true;
for(std::size_t i = 1,s = v.size();i<s && all_equal;i++)
all_equal = e == v[i];
Edit:
Regarding performance, after testing with 100m elements i found out that in Visual Studio 2015 version 1 is about twice as fast as version 2. This is because the latest compiler for vs2015 uses sse instructions in c++ std implementations when you use ints, float , etc..
if you use _mm_testc_si128 you will get a similar performance to std::equal
using std::all_of and C++11 lambda
if (all_of(values.begin(), values.end(), [&] (int i) {return i == values[0];})){
//all are the same
}
Given no constraints on the vector, you have to iterate through the vector at least once, no matter the approach. So just pick the first element and check that all others are equal to it.
While the asymptotic complexity of std::unique is linear, the actual cost of the operation is probably much larger than you need, and it is an inplace algorithm (it will modify the data as it goes).
The fastest approach is to assume that if the vector contains a single element, it is unique by definition. If the vector contains more elements, then you just need to check whether all of them are exactly equal to the first. For that you only need to find the first element that differs from the first, starting the search from the second. If there is such an element, the elements are not unique.
if (v.size() < 2) return true;
auto different = std::find_if(v.begin()+1, v.end(),
[&v](auto const &x) { x != v[0]; });
return different == v.end();
That is using C++14 syntax, in an C++11 toolchain you can use the correct type in the lambda. In C++03 you could use a combination of std::not, std::bind1st/std::bind2nd and std::equal in place of the lambda.
The cost of this approach is distance(start,different element) comparisons and no copies. Expected and worst case linear cost in the number of comparisons (and no copies!)
Sorting is an O(NlogN) task.
This is easily solvable in O(N), so your current method is poor.
A simple O(N) would be as Luchian Grigore suggests, iterate over the vector, just once, comparing every element to the first element.
if(std::all_of(myvector.begin()+1, myvector.end(), std::bind(std::equal_to<int>(),
std::placeholders::_1, myvector.front())) {
// all members are equal
}
You can use FunctionalPlus(https://github.com/Dobiasd/FunctionalPlus):
std::vector<std::string> things = {"same old", "same old"};
if (fplus::all_the_same(things))
std::cout << "All things being equal." << std::endl;
Maybe something like this. It traverses vector just once and does not mess with the vector content.
std::vector<int> values { 5, 5, 5, 4 };
bool equal = std::count_if(values.begin(), values.end(), [ &values ] (auto size) { return size == values[0]; }) == values.size();
If the values in the vector are something different than basic type you have to implement equality operator.
After taking into account underscore_d remarks, I'm changing possible solution
std::vector<int> values { 5, 5, 5, 4 };
bool equal = std::all_of(values.begin(),values.end(),[ &values ] (auto item) { return item == values[0]; });
In your specific case, iterating over vector element and finding a different element from the first one would be enough. You may even be lucky enough to stop before evaluating all the elements in your vector. (A while loop could be used but I sticked with a for loop for readability reasons)
bool uniqueElt = true;
int firstItem = *myvector.begin();
for (std::vector<int>::const_iterator it = myvector.begin()+1; it != myvector.end() ; ++it) {
if(*it != firstItem) {
uniqueElt = false;
break;
}
}
In case you want to know how many different values your vector contains, you could build a set and check its size to see how many different values are inside:
std::set mySet;
std::copy(mySet.begin(), myvector.begin(), myvector.end());
You can simply use std::count to count all the elements that match the starting element:
std::vector<int> numbers = { 5, 5, 5, 5, 5, 5, 5 };
if (std::count(std::begin(numbers), std::end(numbers), numbers.front()) == numbers.size())
{
std::cout << "Elements are all the same" << std::endl;
}
LLVM provides some independently usable headers+libraries:
#include <llvm/ADT/STLExtras.h>
if (llvm::is_splat(myvector))
std::cout << "All elements are the same!" << std::endl;
https://godbolt.org/z/fQX-jc
for the sake of completeness, because it still isn't the most efficient, you can use std::unique in a more efficient way to decide whether all members are the same, but beware that after using std::unique this way the container is useless:
#include <algorithm>
#include <iterator>
if (std::distance(cntnr.begin(), std::unique(cntnr.begin(), cntnr.end()) == 1)
{
// all members were the same, but
}
Another approach using C++ 14:
bool allEqual = accumulate(v.begin(), v.end(), true, [first = v[0]](bool acc, int b) {
return acc && (b == first);
});
which is also order N.
Here is a readable C++17 solution which might remind students of the other constructors of std::vector:
if (v==std::vector(v.size(),v[0])) {
// you guys are all the same
}
...before C++17, the std::vector rvalue would need its type provided explicitly:
if (v==std::vector<typename decltype(v)::value_type>(v.size(),v[0])) {
// you guys are all the same
}
The C++ function is defined in library in STL. This function operates on whole range of array elements and can save time to run a loop to check each elements one by one. It checks for a given property on every element and returns true when each element in range satisfies specified property, else returns false.
// C++ code to demonstrate working of all_of()
#include <vector>
#include <algorithm>
#include <iostream>
int main()
{
std::vector<int> v(10, 2);
// illustrate all_of
if (std::all_of(v.cbegin(), v.cend(), [](int i){ return i % 2 == 0; }))
{
std::cout << "All numbers are even\n";
}
}

Fastest way to check if all elements in vector have the same value in c++ [duplicate]

If I have a vector of values and want to check that they are all the same, what is the best way to do this in C++ efficiently? If I were programming in some other language like R one way my minds jumps to is to return only the unique elements of the container and then if the length of the unique elements is more than 1, I know all the elements cannot be the same. In C++ this can be done like this:
//build an int vector
std::sort(myvector.begin(), myvector.end());
std::vector<int>::iterator it;
//Use unique algorithm to get the unique values.
it = std::unique(myvector.begin(), myvector.end());
positions.resize(std::distance(myvector.begin(),it));
if (myvector.size() > 1) {
std::cout << "All elements are not the same!" << std::endl;
}
However reading on the internet and SO, I see other answers such using a set or the find_if algorithm. So what is the most efficient way of doing this and why? I imagine mine is not the best way since it involves sorting every element and then a resizing of the vector - but maybe I'm wrong.
You need not to use std::sort. It can be done in a simpler way:
if ( std::adjacent_find( myvector.begin(), myvector.end(), std::not_equal_to<>() ) == myvector.end() )
{
std::cout << "All elements are equal each other" << std::endl;
}
you can use std::equal
version 1:
//assuming v has at least 1 element
if ( std::equal(v.begin() + 1, v.end(), v.begin()) )
{
//all equal
}
This will compare each element with the previous one.
version 2:
//assuming v has at least 1 element
int e = v[0]; //preferably "const auto& e" instead
bool all_equal = true;
for(std::size_t i = 1,s = v.size();i<s && all_equal;i++)
all_equal = e == v[i];
Edit:
Regarding performance, after testing with 100m elements i found out that in Visual Studio 2015 version 1 is about twice as fast as version 2. This is because the latest compiler for vs2015 uses sse instructions in c++ std implementations when you use ints, float , etc..
if you use _mm_testc_si128 you will get a similar performance to std::equal
using std::all_of and C++11 lambda
if (all_of(values.begin(), values.end(), [&] (int i) {return i == values[0];})){
//all are the same
}
Given no constraints on the vector, you have to iterate through the vector at least once, no matter the approach. So just pick the first element and check that all others are equal to it.
While the asymptotic complexity of std::unique is linear, the actual cost of the operation is probably much larger than you need, and it is an inplace algorithm (it will modify the data as it goes).
The fastest approach is to assume that if the vector contains a single element, it is unique by definition. If the vector contains more elements, then you just need to check whether all of them are exactly equal to the first. For that you only need to find the first element that differs from the first, starting the search from the second. If there is such an element, the elements are not unique.
if (v.size() < 2) return true;
auto different = std::find_if(v.begin()+1, v.end(),
[&v](auto const &x) { x != v[0]; });
return different == v.end();
That is using C++14 syntax, in an C++11 toolchain you can use the correct type in the lambda. In C++03 you could use a combination of std::not, std::bind1st/std::bind2nd and std::equal in place of the lambda.
The cost of this approach is distance(start,different element) comparisons and no copies. Expected and worst case linear cost in the number of comparisons (and no copies!)
Sorting is an O(NlogN) task.
This is easily solvable in O(N), so your current method is poor.
A simple O(N) would be as Luchian Grigore suggests, iterate over the vector, just once, comparing every element to the first element.
if(std::all_of(myvector.begin()+1, myvector.end(), std::bind(std::equal_to<int>(),
std::placeholders::_1, myvector.front())) {
// all members are equal
}
You can use FunctionalPlus(https://github.com/Dobiasd/FunctionalPlus):
std::vector<std::string> things = {"same old", "same old"};
if (fplus::all_the_same(things))
std::cout << "All things being equal." << std::endl;
Maybe something like this. It traverses vector just once and does not mess with the vector content.
std::vector<int> values { 5, 5, 5, 4 };
bool equal = std::count_if(values.begin(), values.end(), [ &values ] (auto size) { return size == values[0]; }) == values.size();
If the values in the vector are something different than basic type you have to implement equality operator.
After taking into account underscore_d remarks, I'm changing possible solution
std::vector<int> values { 5, 5, 5, 4 };
bool equal = std::all_of(values.begin(),values.end(),[ &values ] (auto item) { return item == values[0]; });
In your specific case, iterating over vector element and finding a different element from the first one would be enough. You may even be lucky enough to stop before evaluating all the elements in your vector. (A while loop could be used but I sticked with a for loop for readability reasons)
bool uniqueElt = true;
int firstItem = *myvector.begin();
for (std::vector<int>::const_iterator it = myvector.begin()+1; it != myvector.end() ; ++it) {
if(*it != firstItem) {
uniqueElt = false;
break;
}
}
In case you want to know how many different values your vector contains, you could build a set and check its size to see how many different values are inside:
std::set mySet;
std::copy(mySet.begin(), myvector.begin(), myvector.end());
You can simply use std::count to count all the elements that match the starting element:
std::vector<int> numbers = { 5, 5, 5, 5, 5, 5, 5 };
if (std::count(std::begin(numbers), std::end(numbers), numbers.front()) == numbers.size())
{
std::cout << "Elements are all the same" << std::endl;
}
LLVM provides some independently usable headers+libraries:
#include <llvm/ADT/STLExtras.h>
if (llvm::is_splat(myvector))
std::cout << "All elements are the same!" << std::endl;
https://godbolt.org/z/fQX-jc
for the sake of completeness, because it still isn't the most efficient, you can use std::unique in a more efficient way to decide whether all members are the same, but beware that after using std::unique this way the container is useless:
#include <algorithm>
#include <iterator>
if (std::distance(cntnr.begin(), std::unique(cntnr.begin(), cntnr.end()) == 1)
{
// all members were the same, but
}
Another approach using C++ 14:
bool allEqual = accumulate(v.begin(), v.end(), true, [first = v[0]](bool acc, int b) {
return acc && (b == first);
});
which is also order N.
Here is a readable C++17 solution which might remind students of the other constructors of std::vector:
if (v==std::vector(v.size(),v[0])) {
// you guys are all the same
}
...before C++17, the std::vector rvalue would need its type provided explicitly:
if (v==std::vector<typename decltype(v)::value_type>(v.size(),v[0])) {
// you guys are all the same
}
The C++ function is defined in library in STL. This function operates on whole range of array elements and can save time to run a loop to check each elements one by one. It checks for a given property on every element and returns true when each element in range satisfies specified property, else returns false.
// C++ code to demonstrate working of all_of()
#include <vector>
#include <algorithm>
#include <iostream>
int main()
{
std::vector<int> v(10, 2);
// illustrate all_of
if (std::all_of(v.cbegin(), v.cend(), [](int i){ return i % 2 == 0; }))
{
std::cout << "All numbers are even\n";
}
}

Is there a better way of moving elements in a vector

I have a std::vector of elements and would like to move an element to a specified position.
I already have a solution, but I'm courious, if there is a better way to do so.
Let'ts assume I'd like to move the last element to the index pos;
I could do a
auto posToInsert = vecElements.begin();
std::advance(posToInsert, pos);
vecElements.insert(posToInsert, *m_vecRows.rbegin());
vecElements.erase(m_vecRows.rbegin());
but this will reallocate memory.
Sadly a
std::move(vecElements.rbegin(), vecElements.rbegin(), posToInsert);
doesn't do the trick.
My current solution does some swaps, but no new memory allocation
auto newElement = vecElements.rbegin();
for (auto currentPos = vecElements.size()-1; currentPos != pos; --currentPos)
newElement->swap(*(newElement + 1)); // reverseIterator +1 = element before
To clarify it, because #NathanOliver asked ... the remaining ordering of the vector should be preserved.
Is there a better way of doing it?
You could use std::rotate:
#include <algorithm>
#include <vector>
#include <iostream>
int main()
{
std::vector<int> values{1, 2, 3, 4, 5};
std::rotate(values.begin()+2, values.end()-1, values.end());
for(int i: values)
std::cout << i << " ";
std::cout << "\n";
}
try it
Outputs:
1 2 5 3 4
You can probably adjust the iterators used if you need to move an element that isn't at the end.
Move the element out of the vector, erase the element, then insert. This is guaranteed to not reallocate as that only happens when size() > capcity() and that can't happen here because erasing first guarantees that size() <= capcity() - 1
In the case of moving the last element, that would look like
auto temp = std::move(vecElements.back())
vecElements.erase(vecElements.rbegin());
vecElements.insert(posToInsert, std::move(temp));
So, this cost you two moves and no reallocation.

Remove duplicates without using any STL containers

I was asked the following question in a 30-minute interview:
Given an array of integers, remove the duplicates without using any STL containers. For e.g.:
For the input array [1,2,3,4,5,3,3,5,4] the output should be:
[1,2,3,4,5];
Note that the first 3, 4 and 5 have been included, but the subsequent ones have been removed since we have already included them once in the output array. How do we do without using an extra STL container?
In the interview, I assumed that we only have positive integers and suggested using a bit array to mark off every element present in the input (assume every element in the input array as an index of the bit array and update it to 1). Finally, we could iterate over this bit vector, populating (or displaying) the unique elements. However, he was not satisfied with this approach. Any other methods that I could have used?
Thanks.
Just use std::sort() and std::unique():
int arr[] = { 1,2,3,4,5,3,3,5,4 };
std::sort( std::begin(arr), std::end(arr) );
auto end = std::unique( std::begin(arr), std::end(arr) );
Live example
We can first sort the array then check if the next element is equal to the previous one and finally give the answer with the help of another array of size 2 larger than the previous one like this.
Initialize the second array with a value that first array will not take (any number larger/smaller than the limit given) ,suppose 0 for simplicity then
int arr1[] = { 1,2,3,4,5,3,3,5,4 };
int arr2[] = { 0,0,0,0,0,0,0,0,0,0,0 };
std::sort( std::begin(arr1), std::end(arr1) );
int position=1;
arr2[0] = arr1[0];
for(int* i=begin(arr1)+1;i!=end(arr1);i++){
if((*i)!=(*(i-1))){
arr2[position] = (*i);
position++;
}
}
int size = 0;
for(int* i=begin(arr2);i!=end(arr2);i++){
if((*i)!=(*(i+1))){
size++;
}
else{
break;
}
}
int ans[size];
for(int i=0;i<size;i++){
ans[i]=arr2[i];
}
Easy algorithm in O(n^2):
void remove_duplicates(Vec& v) {
// range end
auto it_end = end(v);
for (auto it = begin(v); it != it_end; ++it) {
// remove elements matching *it
it_end = remove(it+1, it_end, *it);
}
// erase now-unused elements
v.erase(it_end, end(v));
}
See also erase-remove idiom
Edit: This is assuming you get a std::vector in, but it would work with C-style arrays too, you would just have to implement the erasure yourself.

How to remove elements from a vector based on a condition in another vector?

I have two equal length vectors from which I want to remove elements based on a condition in one of the vectors. The same removal operation should be applied to both so that the indices match.
I have come up with a solution using std::erase, but it is extremely slow:
vector<myClass> a = ...;
vector<otherClass> b = ...;
assert(a.size() == b.size());
for(size_t i=0; i<a.size(); i++)
{
if( !a[i].alive() )
{
a.erase(a.begin() + i);
b.erase(b.begin() + i);
i--;
}
}
Is there a way that I can do this more efficiently and preferably using stl algorithms?
If order doesn't matter you could swap the elements to the back of the vector and pop them.
for(size_t i=0; i<a.size();)
{
if( !a[i].alive() )
{
std::swap(a[i], a.back());
a.pop_back();
std::swap(b[i], b.back());
b.pop_back();
}
else
++i;
}
If you have to maintain the order you could use std::remove_if. See this answer how to get the index of the dereferenced element in the remove predicate:
a.erase(remove_if(begin(a), end(a),
[b&](const myClass& d) { return b[&d - &*begin(a)].alive(); }),
end(a));
b.erase(remove_if(begin(b), end(b),
[](const otherClass& d) { return d.alive(); }),
end(b));
The reason it's slow is probably due to the O(n^2) complexity. Why not use list instead? As making a pair of a and b is a good idea too.
A quick win would be to run the loop backwards: i.e. start at the end of the vector. This tends to minimise the number of backward shifts due to element removal.
Another approach would be to consider std::vector<std::unique_ptr<myClass>> etc.: then you'll be essentially moving pointers rather than values.
I propose you create 2 new vectors, reserve memory and swap vectors content in the end.
vector<myClass> a = ...;
vector<otherClass> b = ...;
vector<myClass> new_a;
vector<myClass> new_b;
new_a.reserve(a.size());
new_b.reserve(b.size());
assert(a.size() == b.size());
for(size_t i=0; i<a.size(); i++)
{
if( a[i].alive() )
{
new_a.push_back(a[i]);
new_b.push_back(b[i]);
}
}
swap(a, new_a);
swap(b, new_b);
It can be memory consumed, but should work fast.
erasing from the middle of a vector is slow due to it needing to reshuffle everything after the deletion point. consider using another container instead that makes erasing quicker. It depends on your use cases, will you be iterating often? does the data need to be in order? If you aren't iterating often, consider a list. if you need to maintain order, consider a set. if you are iterating often and need to maintain order, depending on the number of elements, it may be quicker to push back all alive elements to a new vector and set a/b to point to that instead.
Also, since the data is intrinsically linked, it seems to make sense to have just one vector containing data a and b in a pair or small struct.
For performance reason need to use next.
Use
vector<pair<myClass, otherClass>>
as say #Basheba and std::sort.
Use special form of std::sort with comparision predicate. And do not enumerate from 0 to n. Use std::lower_bound instead, becouse vector will be sorted. Insertion of element do like say CashCow in this question: "how do you insert the value in a sorted vector?"
I had a similar problem where I had two :
std::<Eigen::Vector3d> points;
std::<Eigen::Vector3d> colors;
for 3D pointclouds in Open3D and after removing the floor, I wanted to delete all points and colors if the points' z coordinate is greater than 0.05. I ended up overwriting the points based on the index and resizing the vector afterward.
bool invert = true;
std::vector<bool> mask = std::vector<bool>(points.size(), invert);
size_t pos = 0;
for (auto & point : points) {
if (point(2) < CONSTANTS::FLOOR_HEIGHT) {
mask.at(pos) = false;
}
++pos;
}
size_t counter = 0;
for (size_t i = 0; i < points.size(); i++) {
if (mask[i]) {
points.at(counter) = points.at(i);
colors.at(counter) = colors.at(i);
++counter;
}
}
points.resize(counter);
colors.resize(counter);
This maintains order and at least in my case, worked almost twice as fast than the remove_if method from the accepted answer:
for 921600 points the runtimes were:
33 ms for the accepted answer
17 ms for this approach.