Keep the duplicated values only - Vectors C++

Assume I have a vector with the following elements {1, 1, 2, 3, 3, 4}
I want to write a program with c++ code to remove the unique values and keep only the duplicated once. So the end result will be something like this {1,3}.
So far this is what I've done, but it takes a lot of time.
Is there any way to make this more efficient?
vector<int> g1 = {1, 1, 2, 3, 3, 4};
vector<int> g2;
for (int i = 0; i < g1.size(); i++)
{
    if (count(g1.begin(), g1.end(), g1[i]) > 1)
        g2.push_back(g1[i]);
}
g2.erase(std::unique(g2.begin(), g2.end()), g2.end());
for (int i = 0; i < g2.size(); i++)
{
    cout << g2[i];
}

My approach is to create an <algorithm>-style template, and use an unordered_map to do the counting. This means you only iterate over the input list once, and the time complexity is O(n). It does use O(n) extra memory though, and isn't particularly cache-friendly. Also this does assume that the type in the input is hashable.
#include <algorithm>
#include <iostream>
#include <iterator>
#include <unordered_map>
template <typename InputIt, typename OutputIt>
OutputIt copy_duplicates(
    InputIt first,
    InputIt last,
    OutputIt d_first)
{
    std::unordered_map<typename std::iterator_traits<InputIt>::value_type,
                       std::size_t> seen;
    for ( ; first != last; ++first) {
        if ( 2 == ++seen[*first] ) {
            // only output on the second time of seeing a value
            *d_first = *first;
            ++d_first;
        }
    }
    return d_first;
}

int main()
{
    int i[] = {1, 2, 3, 1, 1, 3, 5}; // prints 1, 3,
    copy_duplicates(std::begin(i), std::end(i),
                    std::ostream_iterator<int>(std::cout, ", "));
}
This can output to any kind of iterator. There are special iterators you can use that when written to will insert the value into a container.

Here's a way that's a little more cache-friendly than the unordered_map answer, but O(n log n) instead of O(n). It uses no extra memory and does no allocations, though the overall constant factor is probably higher in spite of its cache friendliness.
#include <vector>
#include <algorithm>
void only_distinct_duplicates(::std::vector<int> &v)
{
    ::std::sort(v.begin(), v.end());
    auto output = v.begin();
    auto run_start = v.begin();
    auto const end = v.end();
    for (auto test = v.begin(); test != end; ++test) {
        if (*test == *run_start) {
            if ((test - run_start) == 1) {
                *output = *run_start;
                ++output;
            }
        } else {
            run_start = test;
        }
    }
    v.erase(output, end);
}
I've tested this, and it works. If you want a generic version that should work on any type that vector can store:
template <typename T>
void only_distinct_duplicates(::std::vector<T> &v)
{
    ::std::sort(v.begin(), v.end());
    auto output = v.begin();
    auto run_start = v.begin();
    auto const end = v.end();
    for (auto test = v.begin(); test != end; ++test) {
        if (*test != *run_start) {
            if ((test - run_start) > 1) {
                ::std::swap(*output, *run_start);
                ++output;
            }
            run_start = test;
        }
    }
    if ((end - run_start) > 1) {
        ::std::swap(*output, *run_start);
        ++output;
    }
    v.erase(output, end);
}

Assuming the input vector is not sorted, the following will work and is generalized to support any vector with element type T. It will be more efficient than the other solutions proposed so far.
#include <algorithm>
#include <iostream>
#include <vector>
template<typename T>
void erase_unique_and_duplicates(std::vector<T>& v)
{
    auto first{v.begin()};
    std::sort(first, v.end());
    while (first != v.end()) {
        auto last{std::find_if(first, v.end(), [&](const T& i) { return i != *first; })};
        if (last - first > 1) {
            first = v.erase(first + 1, last);
        }
        else {
            first = v.erase(first);
        }
    }
}

int main(int argc, char** argv)
{
    std::vector<int> v{1, 2, 3, 4, 5, 2, 3, 4};
    erase_unique_and_duplicates(v);
    // The following will print '2 3 4'.
    for (int i : v) {
        std::cout << i << ' ';
    }
    std::cout << '\n';
    return 0;
}

I have two improvements for you:
You can start your count at g1.begin() + i; everything before that was already handled by previous iterations of the loop.
You can change the if to == 2 instead of > 1, so each number is added only once, independent of how many occurrences it has. If a number appears 5 times in the vector, the first 3 occurrences are ignored (their count from that position is greater than 2), the 4th makes it into the new vector (its count from there is exactly 2), and the 5th is ignored again. So you can remove the erase of the duplicates.
Example:
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;
int main() {
    vector<int> g1 = {1, 1, 2, 3, 3, 1, 4};
    vector<int> g2;
    for (int i = 0; i < g1.size(); i++)
    {
        if (count(g1.begin() + i, g1.end(), g1[i]) == 2)
            g2.push_back(g1[i]);
    }
    for (int i = 0; i < g2.size(); i++)
    {
        cout << g2[i] << " ";
    }
    cout << endl;
    return 0;
}

I'll borrow a principle from Python, which is excellent for such operations:
You can use a dictionary where the dictionary key is the item in the vector and the dictionary value is the count (start with 1 and increase by one every time you encounter a value that is already in the dictionary).
Afterward, create a new vector (or clear the original) with only the dictionary keys whose count is larger than 1.
Look up std::map.
Hope this helps.

In general, that task has complexity around O(n·n), which is why it appears slow. Does it have to be a vector? Is that a restriction? Must it be ordered? If not, it may be better to store the values in a std::map, which eliminates doubles as it is populated, or in a std::multimap if the presence of doubles matters.

Related

Deleting both an element and its duplicates in a Vector in C++

I've searched the Internet and know how to delete an element (with std::erase) and how to delete the duplicates of an element (vec.erase(std::unique(vec.begin(), vec.end()), vec.end());). But all methods delete only either the element or its duplicates.
I want to delete both.
For example, using this vector:
std::vector<int> vec = {2,3,1,5,2,2,5,1};
I want output to be:
{3}
My initial idea was:
void removeDuplicatesandElement(std::vector<int> &vec)
{
    std::sort(vec.begin(), vec.end());
    int passedNumber = 0; // To tell amount of numbers not deleted (since not duplicated)
    for (int i = 0; i != vec.size(); i = passedNumber) // This is not best practice, but I tried
    {
        if (vec[i] == vec[i+1])
        {
            int ctr = 1;
            for (int j = i+1; j != vec.size(); j++)
            {
                if (vec[j] == vec[i]) ctr++;
                else break;
            }
            vec.erase(vec.begin()+i, vec.begin()+i+ctr);
        }
        else passedNumber++;
    }
}
And it worked. But this code is redundant and runs at O(n^2), so I'm trying to find other ways to solve the problem (maybe an STL function that I've never heard of, or just improve the code).
Something like this, perhaps:
void removeDuplicatesandElement(std::vector<int> &vec) {
    if (vec.size() <= 1) return;
    std::sort(vec.begin(), vec.end());
    int cur_val = vec.front() - 1;
    auto pred = [&](const int& val) {
        if (val == cur_val) return true;
        cur_val = val;
        // Look ahead to the next element to see if it's a duplicate.
        return &val != &vec.back() && (&val)[1] == val;
    };
    vec.erase(std::remove_if(vec.begin(), vec.end(), pred), vec.end());
}
This relies heavily on the fact that std::vector is guaranteed to have contiguous storage. It won't work with any other container.
You can do it using STL maps as follows:
#include <iostream>
#include <vector>
#include <unordered_map>
using namespace std;
void retainUniqueElements(vector<int> &A){
    unordered_map<int, int> Cnt;
    for (auto element : A) Cnt[element]++;
    A.clear(); // removes all the elements of A
    for (auto i : Cnt){
        if (i.second == 1){ // the element occurs exactly once
            A.push_back(i.first); // then add it to our vector
        }
    }
}

int main() {
    vector<int> vec = {2,3,1,5,2,2,5,1};
    retainUniqueElements(vec);
    for (auto i : vec){
        cout << i << " ";
    }
    cout << "\n";
    return 0;
}
Output:
3
Time Complexity of the above approach: O(n)
Space Complexity of the above approach: O(n)
From what you have searched, we can look in the vector for duplicated values, then use the Erase–remove idiom to clean up the vector.
#include <vector>
#include <algorithm>
#include <iostream>
void removeDuplicatesandElement(std::vector<int> &vec)
{
    std::sort(vec.begin(), vec.end());
    if (vec.size() < 2)
        return;
    for (int i = 0; i < vec.size() - 1;)
    {
        // This is for the case we emptied our vector
        if (vec.size() < 2)
            return;
        // This heavily relies on the fact that this vector is sorted
        if (vec[i] == vec[i + 1])
            vec.erase(std::remove(vec.begin(), vec.end(), (int)vec[i]), vec.end());
        else
            i += 1;
    }
    // Since all duplicates are removed, the remaining elements are unique,
    // so the size of the vector is the number of unique elements.
    // But we are not returning anything or any reference, so I'm just gonna leave this here
    // return vec.size()
}
int main()
{
std::vector<int> vec = {2, 3, 1, 5, 2, 2, 5, 1};
removeDuplicatesandElement(vec);
for (auto i : vec)
{
std::cout << i << " ";
}
std::cout << "\n";
return 0;
}
Output: 3
Time complexity: O(n log n) for the sort; each std::remove pass is O(n), so with d duplicated values the loop is O(d·n) overall.

How to remove duplicated items in a sorted vector

I currently have a vector<int> c which contains {1,2,2,4,5,5,6}
and I want to remove the duplicated numbers so that c will have
{1,4,6}. A lot of solutions on the internet seem to just remove one of the duplicates, but I'm trying to remove all occurrences of the duplicated number.
Assume c is always sorted.
I currently have
#include <iostream>
#include <vector>
int main() {
    std::vector<int> c{1,2,2,4,5,5,6};
    for (int i = 0; i < c.size()-1; i++) {
        for (int j = 1; j < c.size(); j++){
            if (c[i] == c[j]){
                // delete i and j?
            }
        }
    }
}
I tried to use two for-loops so that I can compare the current element and the next element. This is where my doubt kicked in. I'm not sure if I'm approaching the problem correctly.
Could I get help on how to approach my problem?
This code is based on the insight that an element is unique in a sorted list if and only if it is different from both elements immediately adjacent to it (except for the starting and ending elements, which are adjacent to one element each). This is true because all identical elements must be adjacent in a sorted array.
void keep_unique(vector<int> &v){
    if (v.size() < 2){ return; }
    vector<int> unique;
    // check the first element individually
    if (v[0] != v[1]){
        unique.push_back(v[0]);
    }
    for (unsigned int i = 1; i < v.size()-1; ++i){
        if (v[i] != v[i-1] && v[i] != v[i+1]){
            unique.push_back(v[i]);
        }
    }
    // check the last item individually
    if (v[v.size()-1] != v[v.size()-2]){
        unique.push_back(v[v.size()-1]);
    }
    v = unique;
}
Almost any time you find yourself deleting elements from the middle of a vector, it's probably best to sit back and think about whether this is the best way to do the job--chances are pretty good that it isn't.
There are a couple of obvious alternatives to that. One is to copy the items you're going to keep into a temporary vector, then when you're done, swap the temporary vector and the original vector. This works particularly well in a case like you've shown in the question, where you're keeping only a fairly small minority of the input data.
The other is to rearrange the data in your existing vector so all the data you don't want is at the end, and all the data you do want is at the beginning, then resize your vector to eliminate those you don't want.
When in doubt, I tend to go the first route. In theory it's probably a bit less efficient (poorer locality of reference), but I've rarely seen a significant slow-down in real use.
That being the case, my initial take would probably be something on this general order:
#include <vector>
#include <iostream>
#include <iterator>
std::vector<int> remove_all_dupes(std::vector<int> const &input) {
    if (input.size() < 2) // zero or one element is automatically unique
        return input;
    std::vector<int> ret;
    // first item is unique if it's different from its successor
    if (input[0] != input[1])
        ret.push_back(input[0]);
    // in the middle, items are unique if they're different from both predecessor and successor
    for (std::size_t pos = 1; pos < input.size() - 1; pos++)
        if (input[pos] != input[pos-1] && input[pos] != input[pos+1])
            ret.push_back(input[pos]);
    // last item is unique if it's different from predecessor
    if (input[input.size()-1] != input[input.size()-2])
        ret.push_back(input[input.size() - 1]);
    return ret;
}

int main() {
    std::vector<int> c { 1, 2, 2, 4, 5, 5, 6 };
    std::vector<int> uniques = remove_all_dupes(c);
    std::copy(uniques.begin(), uniques.end(), std::ostream_iterator<int>(std::cout, "\n"));
}
Probably a little longer of code than we'd really prefer, but still simple, straightforward, and efficient.
If you are going to do the job in place, the usual way to do it efficiently (and this applies to filtering in general, not just this particular filter) is to start with a copying phase and follow that by a deletion phase. In the copying phase, you use two pointers: a source and a destination. You start them both at the first element, then advance through the input with the source. If it meets your criteria, you copy it to the destination position, and advance both. If it doesn't meet your criteria, advance only the source.
Then when you're done with that, you resize your vector down to the number of elements you're keeping.
void remove_all_dupes2(std::vector<int> & input) {
    if (input.size() < 2) { // 0 or 1 element is automatically unique
        return;
    }
    std::size_t dest = 0;
    if (input[0] != input[1])
        ++dest;
    for (std::size_t source = 1; source < input.size() - 1; source++) {
        if (input[source] != input[source-1] && input[source] != input[source+1]) {
            input[dest++] = input[source];
        }
    }
    if (input[input.size()-1] != input[input.size()-2]) {
        input[dest++] = input[input.size() - 1];
    }
    input.resize(dest);
}
At least in my view, the big thing to keep in mind here is the general pattern. You'll almost certainly run into a lot more situations where you want to filter some inputs to those that fit some criteria, and this basic pattern of tracking source and destination, and copying only those from the source to the destination that fit your criteria works well in a lot of situations, not just this one.
Generally one has to be very careful when deleting from containers while iterating over them. C++ STL can do this easily and faster (on average) than using nested loops.
#include <iostream>
#include <vector>
#include <algorithm>
#include <unordered_set>

int main() {
    std::vector<int> c{1,2,2,4,5,5,6};
    std::unordered_multiset<int> unique( c.begin(), c.end() );
    c.erase(std::remove_if(c.begin(), c.end(),
                           [&](const auto& e){ return unique.count(e) > 1; }),
            c.end());
    for (auto e : c){
        std::cout << e << ' ';
    }
}
//Output: 1 4 6
Alternatively, you could use std::map<int,std::size_t> and count the occurences this way.
Similarly to std::unique/std::copy_if, you might do:
void keep_unique(std::vector<int>& v){
    auto it = v.begin();
    auto w = v.begin();
    while (it != v.end())
    {
        auto next = std::find_if(it, v.end(), [&](int e){ return e != *it; });
        if (std::distance(it, next) == 1) {
            if (w != it) {
                *w = std::move(*it);
            }
            ++w;
        }
        it = next;
    }
    v.erase(w, v.end());
}
Use std::remove_if to move items occurring multiple times to the rear, then erase them.
#include <iostream>
#include <vector>
#include <algorithm>

int main()
{
    std::vector<int> V {1,2,2,4,5,5,6};
    // Count against an unmodified copy: counting over V itself while
    // remove_if shifts its elements would give unreliable results.
    const std::vector<int> original = V;
    auto it = std::remove_if(V.begin(), V.end(), [&](const auto& val)
    {
        return std::count(original.begin(), original.end(), val) > 1;
    });
    V.erase(it, V.end());
    for (const auto& val : V)
        std::cout << val << std::endl;
    return 0;
}
Output:
1
4
6
For demo: https://godbolt.org/z/j6fxe1
Iterating in reverse means we are usually erasing at or near the back of the vector, which avoids most element shifting, and no other data structures need to be allocated.
For every element encountered, we check if the adjacent element is equal, and if so, remove all instances of that element.
Requires the vector to be sorted, or at least grouped by duplicates. (Each erase can still shift trailing elements, so this is not strictly O(N); see the update below.)
#include <iostream>
#include <vector>
int main()
{
    std::vector<int> c {1, 2, 2, 4, 5, 5, 6};
    for (int i = c.size() - 1; i > 0;)
    {
        const int n = c[i];
        if (c[i - 1] == n)
        {
            while (i >= 0 && c[i] == n)
            {
                c.erase(c.begin() + i--);
            }
        }
        else
        {
            i--;
        }
    }
    // output result
    for (auto it : c)
    {
        std::cout << it;
    }
    std::cout << std::endl;
}
Output: 146
Update
An actual O(N) implementation using a sentinel value:
#include <iostream>
#include <vector>
#include <limits>
#include <algorithm>
int main()
{
    std::vector<int> c { 1, 2, 2, 4, 5, 5, 6 };
    const int sentinel = std::numeric_limits<int>::lowest(); // assumes no valid member uses this value
    for (int i = 0; i < c.size() - 1;)
    {
        const int n = c[i];
        if (c[i + 1] == n)
        {
            while (i < (int)c.size() && c[i] == n)
            {
                c[i++] = sentinel;
            }
        }
        else
        {
            i++;
        }
    }
    c.erase(std::remove(c.begin(), c.end(), sentinel), c.end());
    for (auto it : c) std::cout << it << ' ';
}
This can be achieved with the proper use of iterators to avoid runtime errors.
Have a look at the following code:
#include <iostream>
#include <iterator>
#include <vector>

int main() {
    std::vector<int> c{1,2,2,4,5,5,6};
    for (auto it = c.begin(); it != c.end(); ){
        bool isDuplicate = false;
        for (auto it2 = std::next(it); it2 != c.end(); ){
            if (*it == *it2){
                it2 = c.erase(it2); // erase returns the next valid iterator
                isDuplicate = true;
            } else {
                ++it2;
            }
        }
        if (isDuplicate){
            it = c.erase(it);
        } else {
            ++it;
        }
    }
    for (auto it = c.begin(); it != c.end(); it++){
        std::cout << *it << " ";
    }
}
Output:
1 4 6

Sorting vector elements in descending order

Please tell me what is wrong with my approach.
When I run the code, it takes too long to compute and I never see the result.
#include <iostream>
#include <vector>
using namespace std;
vector<int> vec;
vector<int> sort(vector<int> x) {
    vector<int> y;
    int i = 1;
    reset: for (i = 1; i <= x.size(); i++){
        for (int j = 1; j <= x.size();) {
            if (j == i) {
                j++;
            }
            else {
                if (x[i - 1] > x[j - 1]) {
                    j++;
                }
                else {
                    i++;
                    goto reset;
                }
            }
        }
        y.push_back(x[i - 1]);
        x.erase(x.begin() + i - 1);
    }
    return y;
}

int main(){
    vec.push_back(5);
    vec.push_back(9);
    vec.push_back(3);
    vec.push_back(6);
    vec.push_back(2);
    for (int i = 1; i <= vec.size(); i++) {
        cout << sort(vec)[i-1] << " ";
    }
}
I am sorting this given sequence of 5 integers into descending order. Please help.
My plan was to search for the greatest integer in the whole vector x and move to it to the vector y and repeat the process.
Simple bubble-sort example
I think that your sort function is entering an infinite loop because of the goto reset statement. If you want to implement a simple bubble-sort algorithm, you can do it like this:
#include <iostream>
#include <utility>
#include <vector>
void bubble_sort(std::vector<int>& v) {
    if (v.size() == 0) return;
    for (int max = v.size(); max > 0; max--) {
        for (int i = 1; i < max; i++) {
            int& current = v[i - 1];
            int& next = v[i];
            if (current < next)
                std::swap(current, next);
        }
    }
}
This function takes a vector, and for every consecutive pair of elements in the vector, if they're out of order, it swaps them. This results in the smallest element "bubbling" to the end of the vector on each pass. The process is repeated until all the elements are in order.
If we test it, we see that it prints the right answer:
int main() {
    std::vector<int> test = {5, 9, 3, 6, 2};
    bubble_sort(test);
    for (int i : test) {
        std::cout << i << ' ';
    }
    std::cout << '\n';
}
Using std::sort to do this faster
The standard library provides a sort function that'll sort pretty much anything. std::sort is really well implemented; it's more efficient than bubble sort, and it's really easy to use.
By default, std::sort orders things in ascending order, although it's easy to change it so that it works in descending order. There are two ways to do this. The first way sorts the vector using the reverse iterators (which allow you to pretend the vector is in reverse order), and the second way sorts the vector using std::greater, which tells std::sort to sort things in reverse order.
// Way 1:
std::sort(test.rbegin(), test.rend());
// Way 2:
auto compare_func = std::greater<>();
std::sort(test.begin(), test.end(), compare_func);
We can re-write the program using std::sort:
#include <iostream>
#include <vector>
#include <algorithm>
int main() {
    std::vector<int> test = {5, 9, 3, 6, 2};
    auto compare_function = std::greater<>();
    std::sort(test.begin(), test.end(), compare_function);
    for (int i : test) {
        std::cout << i << ' ';
    }
    std::cout << '\n';
}
Why can't you just use std::sort? You can do this:
sort(vec.begin(), vec.end(), [](const int a, const int b) {return a > b; }); //1
As suggested in the comments, there are two alternatives to the above:
std::sort(vec.begin(), vec.end(), std::greater<>()); //2
and:
std::sort(vec.rbegin(), vec.rend()); //3
(2) and (3) avoid a custom comparison function, and (2) is arguably more explicit about its intent. But I was interested in the performance, so I did a quick benchmark of the three.
With Clang 12.0, (1) was fastest; with GCC 10.3, all three were near identical.
Interesting results! With GCC, it's your choice as to which version you prefer; otherwise I would go for (1) or (2).

What's the most efficient way to print all elements of a vector in ascending order until it's empty, without duplicates?

I'm supposed to:
Print vector elements sorted without repetition.
Delete the elements that are printed from vector.
Repeat the the previous steps until vector is empty.
But my code seems to take too much time, so I'm looking for ways to optimise it. I've tried to do this task with std::vector and std::set.
Here is my approach:
#include <iostream>
#include <algorithm>
#include <vector>
#include <set>
using namespace std;
int main () {
    int n;
    cin >> n;
    vector<int> v(n);
    set<int> st;
    for (int i = 0; i < n; i++) {
        cin >> v[i];
    }
    while (!v.empty()) {
        for (int i = 0; i < v.size(); i++)
            st.insert(v[i]);
        for (auto x : st) {
            cout << x << ' ';
            auto it = find(v.begin(), v.end(), x);
            if (it != v.end())
                v.erase(it);
        }
        st.clear();
        cout << "\n";
    }
    return 0;
}
For example input is like:
7
1 2 3 3 2 4 3
The output should be like this:
1 2 3 4
2 3
3
You might use std::map instead of std::vector/std::set to keep track of numbers:
#include <iostream>
#include <map>
int main () {
    std::map<int, int> m;
    int size;
    std::cin >> size;
    for (int i = 0; i != size; i++) {
        int number;
        std::cin >> number;
        ++m[number];
    }
    while (!m.empty()) {
        for (auto it = m.begin(); it != m.end(); /*Empty*/) {
            const auto number = it->first;
            auto& count = it->second;
            std::cout << number << ' ';
            if (--count == 0) {
                it = m.erase(it);
            } else {
                ++it;
            }
        }
        std::cout << "\n";
    }
}
Complexity is now O(n log(n)) instead of O(n²) (with lot of internal allocations).
Because it overwrites the elements that are expected to be removed, std::unique won't be much use for this problem. My solution:
std::sort(v.begin(), v.end());
while (!v.empty())
{
    int last = v.front();
    std::cout << last << " ";
    v.erase(v.begin());
    for (auto it = v.begin(); it != v.end(); /* no-op */)
    {
        if (*it == last)
        {
            ++it;
        }
        else
        {
            last = *it;
            std::cout << last << " ";
            it = v.erase(it);
        }
    }
    std::cout << std::endl;
}
You could probably improve performance further by reversing the sorting of the vector, and then iterating through backwards (since it's cheaper to delete from closer to the back of the vector), but that would complicate the code further, so I'll say "left as an exercise for the reader".
You can use std::map
auto n = 0;
std::cin >> n;
std::map<int, int> mp;
while (--n >= 0) {
    auto i = 0;
    std::cin >> i;
    mp[i] += 1;
}
while (!mp.empty()) {
    for (auto& it : mp) {
        std::cout << it.first << " ";
        it.second--;
    }
    for (auto it = mp.begin(); it != mp.end(); ) {
        if (it->second == 0)
            it = mp.erase(it); // erase returns the next valid iterator
        else
            ++it;
    }
    std::cout << "\n";
}
Or, without any erase:
auto n = 0;
std::cin >> n;
std::map<int, int> mp;
while (--n >= 0) {
    auto i = 0;
    std::cin >> i;
    mp[i] += 1;
}
auto isDone = false;
while (!isDone) {
    isDone = true;
    for (auto& it : mp) {
        if (it.second > 0) std::cout << it.first << " ";
        if (--it.second > 0) isDone = false;
    }
    std::cout << "\n";
}
Here is a solution using sort and vector. It uses a second vector to hold the unique items and print them.
#include <iostream>
#include <vector>
#include <algorithm>
#include <iterator>
int main()
{
    std::vector<int> v{1,2,3,3,2,4,3};
    std::sort(v.begin(), v.end());
    std::vector<int>::iterator vit;
    while (!v.empty()){
        std::vector<int> printer;
        std::vector<int>::iterator pit;
        vit = v.begin();
        while (vit != v.end()){
            pit = find(printer.begin(), printer.end(), *vit);
            if (pit == printer.end()){
                printer.push_back(*vit);
                vit = v.erase(vit);
            } else {
                ++vit;
            }
        }
        std::copy(printer.begin(), printer.end(), std::ostream_iterator<int>(std::cout, " "));
        std::cout << '\n';
    }
}
Output:
1 2 3 4
2 3
3
It's not clear (at least to me) exactly what you're talking about when you mention "efficiency". Some people use it to refer solely to computational complexity. Others think primarily in terms of programmer's time, while still others think of overall execution speed, regardless of whether that's obtained via changes in computational complexity, or (for one example) improved locality of reference leading to better cache utilization.
So, with that warning, I'm not sure whether this really improves what you care about or not, but it's how I think I'd do the job anyway:
#include <iostream>
#include <vector>
#include <algorithm>
#include <iterator>
// preconditions: input range is sorted
template <class BidiIt>
BidiIt partition_unique(BidiIt begin, BidiIt end) {
    auto pivot = end;
    for (auto pos = begin; pos != pivot; ++pos) {
        auto mid = std::next(pos);
        for ( ; mid < pivot && *mid == *pos; ++mid, --pivot)
            ;
        std::rotate(std::next(pos), mid, end);
    }
    return pivot;
}

template <class It>
void show(It b, It e, std::ostream &os) {
    while (b != e) {
        os << *b << ' ';
        ++b;
    }
    os << '\n';
}

int main() {
    std::vector<int> input{ 1, 2, 3, 3, 2, 4, 3 };
    std::sort(input.begin(), input.end());
    auto begin = input.begin();
    auto pos = begin;
    while ((pos = partition_unique(begin, input.end())) != input.end()) {
        show(begin, pos, std::cout);
        begin = pos;
    }
    show(begin, input.end(), std::cout);
}
I'm not really sure it's possible to improve the computational complexity much over what this does (but it might be; I haven't thought about it enough to be sure one way or the other). Compared to some versions posted already, there's a decent chance this will improve overall speed (e.g., since it just moves things around inside the same vector, it's likely to get better locality than those that copy data from one vector to another).
The code is in Java, but the idea remains the same.
At first, I sort the array. Now, the idea is to create buckets: each line of sorted output corresponds to a bucket. Find the count of each element, then place one occurrence of the element into each successive bucket; if there aren't enough buckets yet, create a new bucket and add the current element to it.
In the end, print all buckets.
Time complexity is O(n log(n)) for sorting and O(n) for the buckets, since you have to visit each and every element to print it. So it's O(n log(n)) + O(n) = O(n log(n)) asymptotically.
Code:
import java.util.*;

public class GFG {
    public static void main(String[] args){
        int[] arr1 = {1,2,3,3,2,4,3};
        int[] arr2 = {45,98,65,32,65,74865};
        int[] arr3 = {100,100,100,100,100};
        int[] arr4 = {100,200,300,400,500};
        printSeries(compute(arr1,arr1.length));
        printSeries(compute(arr2,arr2.length));
        printSeries(compute(arr3,arr3.length));
        printSeries(compute(arr4,arr4.length));
    }

    private static void printSeries(List<List<Integer>> res){
        int size = res.size();
        for(int i=0;i<size;++i){
            System.out.println(res.get(i).toString());
        }
    }

    private static List<List<Integer>> compute(int[] arr,int N){
        List<List<Integer>> buckets = new ArrayList<List<Integer>>();
        Arrays.sort(arr);
        int bucket_size = 0;
        for(int i=0;i<N;++i){
            int last_index = i;
            if(bucket_size > 0){
                buckets.get(0).add(arr[i]);
            }else{
                buckets.add(newBucket(arr[i]));
                bucket_size++;
            }
            for(int j=i+1;j<N;++j){
                if(arr[i] != arr[j]) break;
                if(j-i < bucket_size){
                    buckets.get(j-i).add(arr[i]);
                }else{
                    buckets.add(newBucket(arr[i]));
                    bucket_size++;
                }
                last_index = j;
            }
            i = last_index;
        }
        return buckets;
    }

    private static List<Integer> newBucket(int value){
        List<Integer> new_bucket = new ArrayList<>();
        new_bucket.add(value);
        return new_bucket;
    }
}
OUTPUT
[1, 2, 3, 4]
[2, 3]
[3]
[32, 45, 65, 98, 74865]
[65]
[100]
[100]
[100]
[100]
[100]
[100, 200, 300, 400, 500]
This is what I came up with:
http://coliru.stacked-crooked.com/a/b3f06693a74193e5
The key idea:
sort the vector
print by iterating: just print a value if it differs from the last printed one
remove unique elements. I have done this with what I called inverse_unique. The standard library comes with an algorithm called unique, which removes all duplicates; I inverted this so that it keeps only the duplicates.
So we have no memory allocation at all. I can't see how one could make the algorithm more efficient: we are just doing the bare minimum, and it's done exactly the way a human would think about it.
I tested it with several combinations. Hope it's bug free ;-P
code:
#include <iostream>
#include <algorithm>
#include <vector>
template<class ForwardIt>
ForwardIt inverse_unique(ForwardIt first, ForwardIt last)
{
    if (first == last)
        return last;
    auto one_ahead = first + 1;
    auto dst = first;
    while (one_ahead != last)
    {
        if (*first == *one_ahead)
        {
            *dst = std::move(*first);
            ++dst;
        }
        ++first;
        ++one_ahead;
    }
    return dst;
}

void print_unique(std::vector<int> const& v)
{
    if (v.empty()) return;
    // print first
    std::cout << v[0] << ' ';
    auto last_printed = v.cbegin();
    // print others
    for (auto it = std::next(std::cbegin(v)); it != std::cend(v); ++it)
    {
        if (*it != *last_printed)
        {
            std::cout << *it << ' ';
            last_printed = it;
        }
    }
    std::cout << "\n";
}

void remove_uniques(std::vector<int> & v)
{
    auto new_end = inverse_unique(std::begin(v), std::end(v));
    v.erase(new_end, v.end());
}

int main ()
{
    std::vector<int> v = {1, 2, 3, 3, 2, 4, 3};
    std::sort(std::begin(v), std::end(v));
    while (!v.empty())
    {
        print_unique(v);
        remove_uniques(v);
    }
    return 0;
}
Edit: updated the inverse_unique function; it should be easy to understand now. A half-baked earlier version is at http://coliru.stacked-crooked.com/a/c45df1591d967075
Slightly modified counting sort.
#include <iostream>
#include <vector>
#include <algorithm>
#include <iterator>
#include <map>
int main() {
    std::vector<int> v{1,2,3,3,2,4,3};
    std::map<int, int> map;
    for (auto x : v)
        ++map[x];
    while (map.size()) {
        for (auto pair = map.begin(); pair != map.end(); ) {
            std::cout << pair->first << ' ';
            if (!--pair->second)
                pair = map.erase(pair);
            else
                ++pair;
        }
        std::cout << "\n";
    }
    return 0;
}

how can I find repeated elements in a vector [duplicate]

This question already has answers here:
Checking for duplicates in a vector [duplicate]
(5 answers)
Closed 9 years ago.
I have a vector of int which can include maximum 4 elements and minimum 2, for example :
std::vector<int> vectorDATA(X); // x means unknown here
What I want to do is to erase the elements that are repeated for example :
vectorDATA{1,2,2} to vectorDATA{1,2}
vectorDATA{1,2,3} to nothing changes
vectorDATA{2,2,2} to vectorDATA{2}
vectorDATA{3,2,1,3} to vectorDATA{3,2,1}
vectorDATA{1,2,1,2} to vector{1,2}
and so on
Here is a simple code sample:
cv::HoughLines(canny, lineQ, 1, CV_PI/180, 200);
std::cout << " line Size " << lineQ.size() << std::endl;
std::vector<int> linesData(lineQ.size());
std::vector<int>::iterator it;
if (lineQ.size() <= 4 && lineQ.size() != 0){
    if (lineQ.size() == 1){
        break;
    } else {
        for (int i = 0; i < lineQ.size(); i++){
            linesData[i] = lineQ[i][1]; // my comparison parameter is lineQ[i][1]
        }
        // based on the answer I got I'm trying this, but I really don't know how to continue
        std::sort(lineQ.begin(), lineQ.end(), [](const cv::Vec2f &a, const cv::Vec2f &b)
        {
            return ????
        }
I tried using a for and a do-while loop, but I didn't get it to work, and std::adjacent_find has the condition that the equal elements must be consecutive.
Maybe it's easy, but I just don't get it!
thanks for any help !
The easy way is sort then unique-erase, but this changes order.
The c++11 order preserving way is to create an unordered_set<int> s; and do:
unordered_set<int> s;
vec.erase(
    std::remove_if(vec.begin(), vec.end(), // remove from vector
        [&](int x) -> bool {
            return !std::get<1>(s.insert(x)); // true iff the item was already in the set
        }
    ),
    vec.end() // erase from the end of kept elements to the end of the `vec`
);
which is the remove-erase idiom using the unordered_set to detect duplicates.
I didn't see a sort-less solution among the already mentioned answers, so here it goes: a hash table for checking duplicates, shifting unique elements towards the front of the vector. Note that src is always >= dst, and dst is the count of copied, i.e. unique, elements at the end.
#include <unordered_set>
#include <vector>
#include <iostream>
void uniq (std::vector<int> &a) {
    std::unordered_set<int> s;
    size_t dst = 0;
    for (size_t src = 0; src < a.size(); ++src) {
        if (s.count (a[src]) == 0) {
            s.insert (a[src]);
            a[dst++] = a[src];
        }
    }
    a.resize (dst);
}

int main () {
    std::vector<int> a = { 3, 2, 1, 3, 2, 1, 2, 3, 4, 5, 2, 3, 1, 1 };
    uniq (a);
    for (auto v : a)
        std::cout << v << " ";
    std::cout << std::endl;
}
If you want to really remove repeated elements, you may try something like this:
#include <iostream>
#include <algorithm>
#include <vector>
using namespace std;
int main () {
    int data[] = {1,2,3,2,1};
    vector<int> vectorDATA(&data[0], &data[0] + 5); // iterator-range constructor
    sort(vectorDATA.begin(), vectorDATA.end());
    for (int i = 0; i + 1 < vectorDATA.size(); )
    {
        if (vectorDATA[i] == vectorDATA[i+1])
            vectorDATA.erase(vectorDATA.begin()+i+1); // stay on i to catch further repeats
        else
            ++i;
    }
    for (int i = 0; i < vectorDATA.size(); ++i)
    {
        cout << vectorDATA[i] << " ";
    }
    cout << endl;
    return 0;
}
A drawback of this method is that the elements lose their original order.