Erase duplicate element from a vector - c++

I have a vector with several elements in C++ and I want to remove the elements that have duplicate values. Basically, whenever an element duplicates an earlier one, I want to skip that whole index of the vector. My vector is called person. I am trying to do something like:
for(int i=0; i < person.size(); i++){
if(i>0 && person.at(i) == person.at(0:i-1)) { // matlab operator
continue;
}
writeToFile( person.at(i) );
}
How is it possible to create the operator 0:i-1 to check all possible combinations of indexes?
Edit: I am trying GarMan's solution, but I am having issues with the for-each loop:
set<string> myset;
vector<string> outputvector;
for (string element:person)
{
if (myset.find(element) != myset.end())
{
myset.insert(element);
outputvector.emplace_back(element);
}
}
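The 0:i-1 range from the question maps onto the iterator range [person.begin(), person.begin() + i), which std::find can search. A minimal sketch, assuming person is a std::vector<std::string> as in the edit above; seen_before is a hypothetical helper name, not something from the thread:

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Returns true if person[i] already occurs somewhere in person[0..i-1].
// Hypothetical helper; the name is an assumption for illustration.
bool seen_before(const std::vector<std::string>& person, std::size_t i)
{
    // [begin, begin + i) is the half-open range covering indices 0..i-1.
    return std::find(person.begin(), person.begin() + i, person[i])
           != person.begin() + i;
}
```

Calling this inside the loop in place of the pseudo-operator costs O(n^2) comparisons overall, which is usually acceptable only for small vectors.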

Here is an "in-place" version (no second vector required) that should work with older compilers:
std::set<std::string> seen_so_far;
for (std::vector<std::string>::iterator it = person.begin(); it != person.end();)
{
bool was_inserted = seen_so_far.insert(*it).second;
if (was_inserted)
{
++it;
}
else
{
swap(*it, person.back());
person.pop_back();
}
}
Let me know if this works for you. Note that the order of elements is not guaranteed to stay the same.

Something like this will work
unordered_set<same_type_as_vector> myset;
vector<same_type_as_vector> outputvector;
for (auto&& element: myvector)
{
// Keep the element only if it has NOT been seen before.
if (myset.find(element) == myset.end())
{
myset.insert(element);
outputvector.emplace_back(element);
}
}
myvector.swap(outputvector);
Code written into reply box, so might need tweaking.

If you can sort your vector, you can simply call std::unique.
#include <algorithm>
std::sort(person.begin(), person.end());
person.erase(std::unique(person.begin(), person.end()), person.end());
If you cannot sort, you can use a hash table instead: scan the vector once and insert each element into the table as you go. Checking whether an element is already present takes O(1) on average, so the whole pass is O(n). You don't need to compare each element against all the others, which would cost O(n^2).
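A minimal sketch of the hash-table scan, assuming the elements are std::string and that the first occurrence of each value should stay in its original position. dedupe_keep_order is a hypothetical name chosen for illustration:

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Removes later duplicates and keeps the first occurrence of each value
// in its original relative order. Hypothetical helper name.
void dedupe_keep_order(std::vector<std::string>& v)
{
    std::unordered_set<std::string> seen;
    std::vector<std::string> out;
    out.reserve(v.size());
    for (const std::string& s : v) {
        // insert() returns a pair; .second is true only for a new value.
        if (seen.insert(s).second) {
            out.push_back(s);
        }
    }
    v.swap(out);
}
```

Using the return value of insert() avoids a separate find() call, so each element is hashed once.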

Related

Deleting from std::list in a nested loop returns access violation

I have a large list of elements, with possible duplicates. I want to delete those duplicates, but my program results in an access violation error after deleting around 700 items.
Here is my code:
for (auto it : endlist){
bool first = true;
for (auto it2 : endlist){
if (!first){
if (similar(it, it2)){
endlist.remove(it2);
continue;
}
rotate( it);
if (similar(it, it2)){
endlist.remove(it2);
continue;
}
rotate(it);
if (similar(it, it2)){
endlist.remove(it2);
continue;
}
rotate(it);
if (similar(it, it2)){
endlist.remove(it2);
continue;
}
}
first = false;
}
}
The access violation is thrown in the second for loop. Can somebody explain why this happens?
Why don't you use std::list::sort() followed by std::list::unique() instead? unique() removes adjacent equal elements, so on a sorted list it gets rid of all duplicates.
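As for the crash itself: endlist.remove(it2) erases the node that the inner range-based for loop is currently visiting, so the loop's hidden iterator is invalidated and advancing it is undefined behaviour. The sort-then-unique route avoids erasing during iteration. A minimal sketch, assuming a plain std::list<int>; for the matrices in the question you would additionally need an ordering and an equality that agree with similar(), which the thread does not show:

```cpp
#include <list>

// Sorts the list and then drops adjacent equal elements,
// leaving exactly one copy of each value.
void remove_duplicates(std::list<int>& values)
{
    values.sort();    // member sort; std::sort needs random-access iterators
    values.unique();  // removes consecutive duplicates only, hence the sort first
}
```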
What you asked for:
// Assumes endlist is a std::vector (std::list has no operator[]).
for (size_t i = 0; i < endlist.size(); ++i)
{
for (size_t j = i + 1; j < endlist.size(); ) // only compare each pair once by using j = i + 1
{
if (sometest(endlist[i], endlist[j]))
{
endlist.erase(endlist.begin() + j); // shifts later elements down, so do not advance j
}
else
{
++j;
}
}
}
What you didn't ask:
If you are able to change your vector and normalize its elements with respect to rotation, this can be done more cleanly with sorting.
For that you'll have to define an operator<(...) for your matrices; this should be possible by comparing their sizes first and then comparing them lexicographically. For this to make sense, store each matrix in endlist in its minimal form under rotation. Once that's done, you can use the other answer's approach (sort, then unique) for filtering.
And if you don't need the duplicates at all, I'd recommend a container that doesn't allow duplicates in the first place, such as a std::set (or the keys of a std::map).
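A rough sketch of the "minimal matrix under rotation" idea. The question does not show its matrix type, so Matrix here is assumed to be a rectangular std::vector<std::vector<int>>, and rotate90, canonical and unique_up_to_rotation are hypothetical names:

```cpp
#include <cstddef>
#include <set>
#include <vector>

using Matrix = std::vector<std::vector<int>>;  // assumed representation

// Returns the matrix rotated by 90 degrees clockwise.
Matrix rotate90(const Matrix& m)
{
    if (m.empty()) return m;
    const std::size_t rows = m.size();
    const std::size_t cols = m[0].size();
    Matrix r(cols, std::vector<int>(rows));
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)
            r[j][rows - 1 - i] = m[i][j];
    return r;
}

// Picks the smallest of the four rotations as the representative.
// std::vector already provides a lexicographic operator<.
Matrix canonical(Matrix m)
{
    Matrix best = m;
    for (int k = 0; k < 3; ++k) {
        m = rotate90(m);
        if (m < best) best = m;
    }
    return best;
}

// Keeps one representative per rotation class; std::set rejects repeats.
std::set<Matrix> unique_up_to_rotation(const std::vector<Matrix>& input)
{
    std::set<Matrix> result;
    for (const Matrix& m : input)
        result.insert(canonical(m));
    return result;
}
```

Two matrices that are rotations of each other share the same four rotations, so they map to the same representative and the set stores only one of them.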

C++ insert element at the beginning of a vector using insert()

Below I have a function that is supposed to extract the odd numbers from a vector into another vector. In the old vector I want to insert the even numbers at the beginning, so that I can resize it later. I know this is not a very efficient way, but I have an exercise from a book that asks me to test the speed of the resize variant against erasing, which I think would have the same speed anyway. My problem is that the insert in the else branch, where I want to insert the even element, does not work this way. What am I doing wrong in the insert call? Am I passing the iterator the wrong way?
std::vector<int> extract(std::vector<int>& even)
{
std::vector<int> odd;
std::vector<int>::size_type size = even.size();
std::vector<int>::const_iterator a = even.begin();
std::vector<int>::const_iterator b = even.end();
while (a != b)
{
if ((*a) % 2 != 0)
{
odd.push_back((*a));
}
else
{
even.insert(even.begin(),(*a));
}
a++;
}
//even.resize(size);
return odd;
}
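No answer to this question is included here, so the following is only a sketch and not the book's intended solution. Inserting at even.begin() invalidates every iterator into the vector, including a and b, so the loop above has undefined behaviour after the first insert. One way to get the "evens at the front, then resize" effect without inserting is to overwrite in place using indices. extract_odds is a hypothetical name:

```cpp
#include <cstddef>
#include <vector>

// Moves the odd values into the returned vector and compacts the even
// values at the front of `even`, then shrinks it with resize().
std::vector<int> extract_odds(std::vector<int>& even)
{
    std::vector<int> odd;
    std::size_t write = 0;  // next slot for an even value
    for (std::size_t read = 0; read < even.size(); ++read) {
        if (even[read] % 2 != 0) {
            odd.push_back(even[read]);
        } else {
            even[write++] = even[read];  // write never overtakes read
        }
    }
    even.resize(write);  // drop the leftover tail
    return odd;
}
```

Indices stay valid because the vector never grows during the loop, and the relative order of both the odd and the even values is preserved.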

how to find set of distinct strings from a given string after cyclic shifts?

I am solving a problem on Codeforces whose statement asks me to find the number of distinct strings that can be obtained from a given string by cyclic shifts.
For example:
Given string: "abcd"
The output should be 4 ("dabc", "cdab", "bcda", "abcd"). Note that "abcd" itself is also counted.
So I wrote:
t=s[l-1];
for(i=l-1;i>0;i--)
{
s[i]=s[i-1];
}
s[0]=t;
I applied the above method length - 1 times to generate all possible strings, but I am unable to find the distinct ones.
Is there any STL function to do this?
You may use the following:
std::set<std::string>
retrieve_unique_rotations(std::string s)
{
std::set<std::string> res;
res.insert(s);
if (s.empty()) {
return res;
}
for (std::size_t i = 0, size = s.size() - 1; i != size; ++i) {
std::rotate(s.begin(), s.begin() + 1, s.end());
res.insert(s);
}
return res;
}
Not sure about STL specific functions, however a general solution could be to have all shifted strings in a list. Then you sort the list and then you iterate over the list elements. When the current element is different to the last, increment the counter.
There is probably a solution that is less memory intensive. For short strings this solution should be sufficient.
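The sort-and-count idea can be sketched as follows. This is only an illustration: count_distinct_shifts is a hypothetical name, and std::unique stands in for the manual "compare with the previous element" pass, since on a sorted range it does the same job:

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Collects every cyclic shift of s, sorts them, and counts the distinct ones.
// Returns 0 for an empty string.
std::size_t count_distinct_shifts(std::string s)
{
    std::vector<std::string> shifts;
    for (std::size_t i = 0; i < s.size(); ++i) {
        shifts.push_back(s);
        std::rotate(s.begin(), s.begin() + 1, s.end());  // shift left by one
    }
    std::sort(shifts.begin(), shifts.end());
    // std::unique moves the distinct values to the front and returns
    // the end of that distinct prefix.
    return static_cast<std::size_t>(
        std::unique(shifts.begin(), shifts.end()) - shifts.begin());
}
```

This stores all n shifts of length n, so memory use is O(n^2) characters, which matches the caveat about short strings.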
You can use a vector v to collect the strings after each rotation with v.push_back(s), where s is the rotated string. Before each push, you can check whether the string already exists with something like:
if (std::find(v.begin(), v.end(), s) == v.end()) // s has not been stored yet
{
increment++;
v.push_back(s);
}
Alternatively, drop increment++ and read the count at the end from v.size().
Hope this helps

How to remove elements from a vector based on a condition in another vector?

I have two equal length vectors from which I want to remove elements based on a condition in one of the vectors. The same removal operation should be applied to both so that the indices match.
I have come up with a solution using vector::erase, but it is extremely slow:
vector<myClass> a = ...;
vector<otherClass> b = ...;
assert(a.size() == b.size());
for(size_t i=0; i<a.size(); i++)
{
if( !a[i].alive() )
{
a.erase(a.begin() + i);
b.erase(b.begin() + i);
i--;
}
}
Is there a way that I can do this more efficiently and preferably using stl algorithms?
If order doesn't matter you could swap the elements to the back of the vector and pop them.
for(size_t i=0; i<a.size();)
{
if( !a[i].alive() )
{
std::swap(a[i], a.back());
a.pop_back();
std::swap(b[i], b.back());
b.pop_back();
}
else
++i;
}
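If you have to maintain the order, you could use std::remove_if. The predicate for b recovers the index of the element it is given from that element's address, so b must be filtered first, while a still has its original layout. This relies on remove_if passing the predicate a reference to the element in place, which common implementations do:
b.erase(remove_if(begin(b), end(b),
[&a, &b](const otherClass& d) { return !a[&d - &*begin(b)].alive(); }),
end(b));
a.erase(remove_if(begin(a), end(a),
[](const myClass& d) { return !d.alive(); }),
end(a));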
It's probably slow because of the O(n^2) complexity: each erase shifts every later element. Why not use a list instead? Storing a and b together as a pair is a good idea too.
A quick win would be to run the loop backwards: i.e. start at the end of the vector. This tends to minimise the number of backward shifts due to element removal.
Another approach would be to consider std::vector<std::unique_ptr<myClass>> etc.: then you'll be essentially moving pointers rather than values.
I propose you create 2 new vectors, reserve memory and swap vectors content in the end.
vector<myClass> a = ...;
vector<otherClass> b = ...;
vector<myClass> new_a;
vector<otherClass> new_b;
new_a.reserve(a.size());
new_b.reserve(b.size());
assert(a.size() == b.size());
for(size_t i=0; i<a.size(); i++)
{
if( a[i].alive() )
{
new_a.push_back(a[i]);
new_b.push_back(b[i]);
}
}
swap(a, new_a);
swap(b, new_b);
This uses extra memory, but should be fast.
Erasing from the middle of a vector is slow because everything after the deletion point has to be shifted. Consider another container that makes erasing quicker. It depends on your use case: will you be iterating often? Does the data need to be in order? If you aren't iterating often, consider a list. If you need to maintain order, consider a set (which keeps its elements sorted). If you are iterating often and need to maintain order, then depending on the number of elements it may be quicker to push all alive elements into a new vector and make a/b refer to that instead.
Also, since the data is intrinsically linked, it seems to make sense to have just one vector containing data a and b in a pair or small struct.
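A minimal sketch of the single-vector layout. myClass and otherClass are stand-ins with assumed members (the question does not show them), and Entry and remove_dead are hypothetical names:

```cpp
#include <algorithm>
#include <vector>

// Stand-ins for the question's types; their real members are unknown.
struct myClass {
    bool is_alive;
    bool alive() const { return is_alive; }
};
struct otherClass {
    int payload;
};

// One record holds the linked pair, so a single erase keeps them in sync.
struct Entry {
    myClass a;
    otherClass b;
};

// Erase-remove idiom: one O(n) pass that preserves the order of survivors.
void remove_dead(std::vector<Entry>& entries)
{
    entries.erase(
        std::remove_if(entries.begin(), entries.end(),
                       [](const Entry& e) { return !e.a.alive(); }),
        entries.end());
}
```

With both values in one element there is no index bookkeeping between two vectors, at the cost of changing every place that currently uses a and b separately.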
For performance reasons, do the following.
Use
vector<pair<myClass, otherClass>>
as @Basheba says, together with std::sort.
Use the overload of std::sort that takes a comparison predicate. Then do not scan from 0 to n; use std::lower_bound instead, because the vector will be sorted. To insert an element, do as CashCow describes in the question "how do you insert the value in a sorted vector?"
I had a similar problem where I had two vectors:
std::vector<Eigen::Vector3d> points;
std::vector<Eigen::Vector3d> colors;
for 3D point clouds in Open3D. After removing the floor, I wanted to delete all points and colors if the points' z coordinate is greater than 0.05. I ended up overwriting the points based on the index and resizing the vectors afterward.
bool invert = true;
std::vector<bool> mask = std::vector<bool>(points.size(), invert);
size_t pos = 0;
for (auto & point : points) {
if (point(2) < CONSTANTS::FLOOR_HEIGHT) {
mask.at(pos) = false;
}
++pos;
}
size_t counter = 0;
for (size_t i = 0; i < points.size(); i++) {
if (mask[i]) {
points.at(counter) = points.at(i);
colors.at(counter) = colors.at(i);
++counter;
}
}
points.resize(counter);
colors.resize(counter);
This maintains order and, at least in my case, ran almost twice as fast as the remove_if method from the accepted answer.
For 921600 points the runtimes were:
33 ms for the accepted answer
17 ms for this approach.

Compare element in a vector with elements in an array

I have two data structures with data in them.
One is a vector, std::vector<int> presentStudents, and the other is a
char array, char cAllowedStudents[256];
Now I have to check every element of the vector against the array: all elements of the vector must be present in the array, and I should return false if any element of the vector is not part of the array.
I want to know the most efficient and simple solution for doing this. I could convert my int vector into a char array and then compare element by element, but that would be a lengthy operation. Is there a better way of achieving this?
I would suggest you use a hash table (std::unordered_set, or std::unordered_map if you need associated values). Store all the elements of the char array in it.
Then check each element of your vector in turn; each lookup is O(1) on average.
Total time complexity O(N), extra space complexity O(N).
Note that you will have to enable C++11 in your compiler.
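A minimal sketch of the hash-table check, assuming each char in the array is an ID that can be widened to int for comparison (the question does not say how the IDs are encoded). all_allowed is a hypothetical name:

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Returns true if every element of present occurs in allowed[0..count-1].
// Assumption: each char is compared after widening to int.
bool all_allowed(const std::vector<int>& present,
                 const char* allowed, std::size_t count)
{
    // Range constructor copies the chars into the set as ints.
    std::unordered_set<int> lookup(allowed, allowed + count);
    for (int id : present) {
        if (lookup.find(id) == lookup.end()) {
            return false;  // the first missing element ends the search
        }
    }
    return true;
}
```

Since a char has at most 256 distinct values, a fixed bool table indexed by the value would work as well; the set is shown because it is what the answer proposes.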
Please refer to the function std::set_difference() in the <algorithm> header. You can use it directly (both ranges must be sorted) and check whether the resulting difference is empty. If it is not empty, return false.
A better solution would be to adapt the implementation of set_difference(), as shown at http://en.cppreference.com/w/cpp/algorithm/set_difference, to return false as soon as you find the first differing element.
Example adaptation:
while (first1 != last1)
{
if (first2 == last2)
return false;
if (*first1 < *first2)
{
return false;
}
else
{
if (*first2 == *first1)
{
++first1;
}
++first2;
}
}
return true;
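The standard library already offers an early-exit subset test on sorted ranges, std::includes, which may save writing the loop by hand. A sketch under the assumption that both inputs can be copied into std::vector<int> and sorted; is_subset is a hypothetical name:

```cpp
#include <algorithm>
#include <vector>

// Returns true if every element of present also occurs in allowed.
// Both vectors are taken by value so they can be sorted locally.
bool is_subset(std::vector<int> present, std::vector<int> allowed)
{
    std::sort(present.begin(), present.end());
    std::sort(allowed.begin(), allowed.end());
    // std::includes(big range, small range) requires both to be sorted.
    return std::includes(allowed.begin(), allowed.end(),
                         present.begin(), present.end());
}
```

Note that std::includes treats the ranges as multisets: if a value appears twice in present, it must also appear at least twice in allowed.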
Sort cAllowedStudents using std::sort.
Iterate over the presentStudents and look for each student in the sorted cAllowedStudents using std::binary_search.
If you don't find an item of the vector, return false.
If all the elements of the vector are found, return true.
Here's a function:
bool check()
{
// Assuming you have access to cAllowedStudents
// and presentStudents from the function.
char* cend = cAllowedStudents+256;
std::sort(cAllowedStudents, cend);
std::vector<int>::iterator iter = presentStudents.begin();
std::vector<int>::iterator end = presentStudents.end();
for ( ; iter != end; ++iter )
{
if ( !(std::binary_search(cAllowedStudents, cend, *iter)) )
{
return false;
}
}
return true;
}
Another way, using std::set_difference.
bool check()
{
// Assuming you have access to cAllowedStudents
// and presentStudents from the function.
char* cend = cAllowedStudents+256;
// std::set_difference requires both ranges to be sorted.
std::sort(cAllowedStudents, cend);
std::sort(presentStudents.begin(), presentStudents.end());
std::vector<int> diff;
std::set_difference(presentStudents.begin(), presentStudents.end(),
cAllowedStudents, cend,
std::back_inserter(diff));
return diff.empty();
}
Sort both lists with std::sort and use std::find iteratively on the array.
EDIT: The trick is to use the previously found position as a start for the next search.
std::sort(begin(pS), end(pS));
std::sort(begin(aS), end(aS));
auto its = begin(aS);
auto ite = end(aS);
for (auto s : pS) {
its = std::find(its, ite, s); // resume from the previous match
if (its == ite) {
std::cout << "Student not allowed" << std::endl;
break;
}
}
Edit: As legends mentions, it is usually more efficient to use binary search (as in R Sahu's answer). However, for small arrays, and if the vector contains a significant fraction of the students from the array (I'd say at least one tenth), the additional overhead of binary search might (or might not) outweigh its asymptotic complexity benefits.
Using C++11. In your case, size is 256. Note that I personally have not tested this, or even put it into a compiler. It should, however, give you a good idea of what to do yourself. I HIGHLY recommend testing the edge cases with this!
#include <algorithm>
bool check(const std::vector<int>& studs,
char* allowed,
unsigned int size){
for(auto x : studs){
// The end iterator is one past the last element, so search [allowed, allowed+size).
if(std::find(allowed, allowed+size, x) == allowed+size)
return false;
}
return true;
}