Split vector to unique and duplicates c++


My goal is to split a vector into two parts: one with the unique values and one with the duplicates.
For example, I have a sorted vector myVec=(1,1,3,4,4,7,7,8,9,9) which should be split into myVecDuplicates=(1,7,4,9) and myVecUnique=(1,4,7,9,3,8). So myVecDuplicates contains all values that have duplicates, while myVecUnique contains all values, each exactly once.
The order does not matter. My idea was to use unique, as it splits a vector into two parts. But I have a problem with my code.
vector<int> myVec(8)={1,1,3,4,4,7,8,9};
vector<int>::iterator firstDuplicate=unique(myVec.begin(),myVec.end());
vector<int> myVecDuplicate=myVec(firstDuplicate,myVec.end()); // the error occurs here: no match for call to '(std::vector<int>) (std::vector<int>::iterator&, std::vector<int>::iterator)'
vector<int> myVecUnique=myVec(myVec.begin()+firstDuplicate-1,myVec.end());
When I compile this code I get an error (on the line marked above) that says 'no match for call to '(std::vector) (std::vector::iterator&, std::vector::iterator)'
Please help me to understand the source of error or maybe suggest some more elegant and fast way to solve my problem (without hash tables)!

Ahh... too many edits in your question for anyone's liking. Just keep it simple by using a map.
In C++, a map comes in really handy for storing the unique values, in sorted order, together with their respective counts.
map<int, int> m;
for(auto &t : myVec){
m[t]++;
}
vector<int> myVecDuplicate, myVecUnique;
for(map<int, int>::iterator it = m.begin(); it != m.end(); it++){
if(it->second > 1) myVecDuplicate.push_back(it->first);
myVecUnique.push_back(it->first);
}
Edit:
You asked: "maybe suggest some more elegant and fast way to solve my problem (without hash tables)!"
Sort the vector, then traverse the sorted vector and, for each element, do:
if (current_value == previous_value){
if(previous_value != previous_previous_value)
myVecDuplicate.push_back(current_value);
}
else{
myVecUnique.push_back(current_value);
}
To start, initialize previous_value = current_value - 1
and previous_previous_value as current_value - 2.
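The traversal described above could be written out roughly as follows. This is only a sketch: the function name splitSorted is made up for illustration, and instead of seeding previous_value with current_value - 1 it compares by index, which avoids integer overflow when the first element is the smallest possible int.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (the name is an assumption, not part of the answer).
// Returns {every distinct value once, every value that occurs more than once}.
std::pair<std::vector<int>, std::vector<int>> splitSorted(std::vector<int> v)
{
    std::sort(v.begin(), v.end());  // v is a copy, so the caller's vector is untouched
    std::vector<int> uniques, duplicates;
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (i == 0 || v[i] != v[i - 1]) {
            uniques.push_back(v[i]);      // first occurrence of this value
        } else if (i == 1 || v[i - 1] != v[i - 2]) {
            duplicates.push_back(v[i]);   // second occurrence: record it once
        }
        // third and later occurrences are skipped
    }
    return {uniques, duplicates};
}
```

For the vector from the question this yields 1,3,4,7,8,9 as the unique values and 1,4,7,9 as the duplicates.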

While this may be frowned upon (for not using standard algorithms and such), I would write some simple solution like this:
vector<int> myVec = {1,1,3,4,4,7,8,9};
unordered_set<int> duplicates;
unordered_set<int> unique;
for(int & v : myVec)
{
if(unique.count(v) > 0)
duplicates.insert(v);
else
unique.insert(v);
}

O(n) complexity solution (it relies on the vector already being sorted, as in the question):
#include <iostream>
#include <vector>
int main()
{
std::vector<int> myVec = {1,1,3,4,4,7,7,8,9,9};
std::vector<int> myVecDuplicatec;
std::vector<int> myVecUnique;
for(int &x : myVec)
{
if(myVecUnique.size() == 0 || myVecUnique.back() != x)
myVecUnique.push_back(x);
else
myVecDuplicatec.push_back(x);
}
std::cout << "V = ";
for(int &x : myVec)
{
std::cout << x << ",";
}
std::cout << std::endl << "U = ";
for(int &x : myVecUnique)
{
std::cout << x << ",";
}
std::cout << std::endl << "D = ";
for(int &x : myVecDuplicatec)
{
std::cout << x << ",";
}
}
cpp.sh/4i45x

std::vector has a constructor that accepts two iterators for the range [first, last). You cannot call a constructor on an existing object, since it has already been created, so your code
myVec(firstDuplicate,myVec.end());
actually tries to use myVec as a functor, but std::vector does not have operator(), hence the error.
You have two options. Pass the two iterators to the constructor directly:
vector<int> myVecDuplicate(firstDuplicate,myVec.end());
or use copy initialization with a temporary vector:
vector<int> myVecDuplicate = vector<int>(firstDuplicate,myVec.end());
Same for the second vector:
vector<int> myVecUnique(myVec.begin(),firstDuplicate);
As pointed out by Logman, std::unique does not guarantee the values of the elements left past the returned iterator, so a working solution can use std::set instead (and you would not have to presort the source vector):
std::set<int> iset;
vector<int> myVecUnique, myVecDuplicate;
for( auto val : myVec )
( iset.insert( val ).second ? myVecUnique : myVecDuplicate ).push_back( val );
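One thing to watch with the loop above: a value that occurs three times is pushed into myVecDuplicate twice. If each duplicated value should be listed only once, as in the question's example, a second set can collect the repeats. A rough sketch, where the name splitWithSets is made up for illustration:

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

// Hypothetical helper (name is an assumption).
// Returns {distinct values, values seen more than once}, both in ascending order.
std::pair<std::vector<int>, std::vector<int>> splitWithSets(const std::vector<int>& v)
{
    std::set<int> seen, repeated;
    for (int val : v) {
        if (!seen.insert(val).second)  // .second is false if val was already in the set
            repeated.insert(val);      // a set stores each repeated value only once
    }
    return {std::vector<int>(seen.begin(), seen.end()),
            std::vector<int>(repeated.begin(), repeated.end())};
}
```

The input does not need to be sorted, and because std::set is a tree rather than a hash table this still respects the "without hash tables" request.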

Related

Trying to understand it better. What is the difference between these two implementations?

These two example both work and do the same thing.
I'm just trying to get what is the difference between them in terms of optimization, speed and overall. Which approach is better and why? Thanks in advance.
First example:
std::map<std::vector<int>, std::vector<double>> data;
printMap(&data);
...
void printMap(std::map<std::vector<int>, std::vector<double>> *p_data){
for(std::map<std::vector<int>, std::vector<double>>::iterator itr = p_data->begin(); itr != p_data->end(); ++itr){
for(auto it = itr->first.begin(); it != itr->first.end(); ++it){
std::cout << *it << std::endl;
}
for(auto it2 = itr->second.begin(); it2 != itr->second.end(); ++it2){
std::cout << *it2 << std::endl;
}
}
}
Second example:
std::map<std::vector<int>, std::vector<double>> data;
printMapRef(data);
void printMapRef(std::map<std::vector<int>,std::vector<double>> &data){
for(std::map<std::vector<int>, std::vector<double>>::iterator itr = data.begin(); itr != data.end(); ++itr){
std::vector<int> tempVecInt = (*itr).first;
std::vector<double> tempVecDouble = (*itr).second;
for (int i = 0; i < tempVecInt.size(); i++){
std::cout << tempVecInt.at(i) << " ";
}
for (int j = 0; j < tempVecDouble.size(); j++){
std::cout << tempVecDouble.at(j) << " ";
}
}
}

The obvious difference is that the first iterates through the vectors that are in the map, while the second creates copies of the vectors in the map, then iterates through the copies.
The second also uses .at to index into each vector, which checks that the index is within bounds (and throws an exception if it isn't).
Especially if the vectors are large, those could easily make the second significantly slower than the first.
The other differences are mostly syntactic. Personally I don't particularly like the syntax of the iterator-based loop, but iterators vs. indices is unlikely to make any real difference in speed.
For what little it's worth, my own preference would be to pass the map in by (const) reference and use range-based for loops. I'd also at least consider using a function to print out the contents of each vector, since you have two loops that should be essentially identical.
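Something along these lines is what that preference might look like. It is only a sketch: the names printVector and printMapConstRef are invented here, and taking the output stream as a parameter is an extra assumption that makes the function easy to try out.

```cpp
#include <cassert>
#include <iostream>
#include <map>
#include <sstream>
#include <vector>

// Hypothetical helper: writes every element of a vector followed by a space.
template <typename T>
void printVector(std::ostream& os, const std::vector<T>& vec)
{
    for (const T& x : vec)
        os << x << ' ';
}

// The map is only read, so it is taken by const reference and nothing is copied.
void printMapConstRef(std::ostream& os,
                      const std::map<std::vector<int>, std::vector<double>>& data)
{
    for (const auto& entry : data) {   // entry is a pair of (key vector, value vector)
        printVector(os, entry.first);
        printVector(os, entry.second);
        os << '\n';
    }
}
```

Calling printMapConstRef(std::cout, data) prints to the console; passing a std::ostringstream captures the text instead.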

Sorted element container which can be accessed by index [duplicate]

I have a set of type set<int> and I want to get an iterator to someplace that is not the beginning.
I am doing the following:
set<int>::iterator it = myset.begin() + 5;
I am curious why this is not working and what is the correct way to get an iterator to where I want it.

myset.begin() + 5; only works for random access iterators, which the iterators from std::set are not.
For input iterators, there's the function std::advance:
set<int>::iterator it = myset.begin();
std::advance(it, 5); // now it is advanced by five
In C++11, there's also std::next which is similar but doesn't change its argument:
auto it = std::next(myset.begin(), 5);
std::next requires a forward iterator. But since std::set<int>::iterator is a bidirectional iterator, both advance and next will work.
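Note that neither function checks bounds, so stepping past end() is undefined behaviour. A small wrapper can guard against that. This is a sketch only; the name nthElement and the choice to throw std::out_of_range are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <set>
#include <stdexcept>

// Hypothetical helper: returns the n-th smallest element (0-based) of a set.
int nthElement(const std::set<int>& s, std::size_t n)
{
    if (n >= s.size())
        throw std::out_of_range("nthElement: index out of range");
    // std::next steps the bidirectional iterator n times, so the cost is linear in n.
    return *std::next(s.begin(), static_cast<std::ptrdiff_t>(n));
}
```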

operator+ is not defined for std::set iterators; it only makes sense for random access iterators.
First solution:
You can use std::advance. The function repeatedly applies the increment or decrement operator (operator++ or operator--) until n elements have been advanced.
set<int>::iterator it = myset.begin();
std::advance(it, 5);
std::cout << *it << std::endl; // the element 5 positions past begin()
Second solution:
Use the std::next or std::prev functions. The performance is the same as with std::advance, because they also apply the increment or decrement operator (operator++ or operator--) repeatedly until n elements have been advanced.
Note: If it is a random access iterator, the function uses operator+ or operator- just once.
set<int>::iterator it1 = myset.begin();
it1 = std::next(it1, 5); // std::next returns the advanced iterator; it does not modify its argument
std::cout << *it1 << std::endl; // the element 5 positions past begin()
set<int>::iterator it2 = myset.end();
it2 = std::prev(it2, 5); // likewise, std::prev returns the moved iterator
std::cout << *it2 << std::endl; // the element 5 positions before end()
Note: If you need access by index, vectors are very efficient at accessing their elements (just like arrays) and relatively efficient at adding or removing elements at the end.

Get element at index from C++11 std::set
std::set in C++ has no getter by index, so you'll have to roll your own by iterating over the set yourself, copying the elements into an array, and then indexing that.
For example:
#include<iostream>
#include<set>
#include<vector>
using namespace std;
int main(){
set<int> uniqueItems; //instantiate a new empty set of integers
uniqueItems.insert(10);
uniqueItems.insert(20); //insert three values into the set
uniqueItems.insert(30);
vector<int> myarray(uniqueItems.size()); //create a vector of the same size as the
//set<int> to accommodate the elements (a plain
//int array with a runtime size is not standard C++)
int i = 0;
for (const int &num : uniqueItems){ //iterate over the set
myarray[i] = num; //assign it to the appropriate array
i++; //element and increment
}
cout << myarray[0] << endl; //get index at zero, prints 10
cout << myarray[1] << endl; //get index at one, prints 20
cout << myarray[2] << endl; //get index at two, prints 30
}
Or a handy dandy function to step through and then return the right one (it additionally needs #include<string> and #include<cstdlib>):
int getSetAtIndex(const set<int> &myset, int index){ //pass by reference to avoid copying the set
int i = 0;
for (const int &num : myset){ //iterate over the set
if (i++ == index){
return num;
}
}
string msg = "index " + to_string(index) + " is out of range";
cout << msg;
exit(8);
}
int main(){
set<int> uniqueItems; //instantiate a new empty set of integers
uniqueItems.insert(10);
uniqueItems.insert(20); //insert three values into the set
uniqueItems.insert(30);
cout << getSetAtIndex(uniqueItems, 1);
}

How to find a unique number using std::find

Hey, here is a trick question asked in class today. I was wondering if there is a way to find a unique number in an array. The usual method is to use two for loops and pick the number which does not match any of the others. I am using std::vector for my array in C++ and was wondering if std::find could spot the unique number, as I wouldn't know where the unique number is in the array.

Assuming that we know that the vector has at least three
elements (because otherwise, the question doesn't make sense),
just look for an element different from the first. If it
happens to be the second, of course, we have to check the third
to see whether it was the first or the second which is unique,
which means a little extra code, but roughly:
std::vector<int>::const_iterator
findUniqueEntry( std::vector<int>::const_iterator begin,
std::vector<int>::const_iterator end )
{
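// first element that differs from *begin (begin is captured by value)
std::vector<int>::const_iterator result
= std::find_if(
std::next( begin ), end, [begin]( int value ) { return value != *begin; } );
// if the second and third elements agree, the first one is the unique one
if ( result == std::next( begin ) && *result == *std::next( result ) ) {
-- result;
}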
return result;
}
(Not tested, but you get the idea.)

As others have said, sorting is one option. Then your unique value(s) will have a different value on either side.
Here's another option that solves it using std::find, in O(n^2) time (a single pass over the vector, but each step searches the rest of the vector). Sorting is not required.
vector<int> findUniques(vector<int> values)
{
vector<int> uniqueValues;
vector<int>::iterator begin = values.begin();
vector<int>::iterator end = values.end();
vector<int>::iterator current;
for(current = begin ; current != end ; current++)
{
int val = *current;
bool foundBefore = false;
bool foundAfter = false;
if (std::find(begin, current, val) != current)
{
foundBefore = true;
}
else if (std::find(current + 1, end, val) != end)
{
foundAfter = true;
}
if(!foundBefore && !foundAfter)
uniqueValues.push_back(val);
}
return uniqueValues;
}
Basically, what is happening here is that I am running std::find on the elements in the vector before my current element, and also running std::find on the elements after my current element. Since my current element already has the value stored in 'val' (i.e., it's in the vector once already), if I find it before or after the current element, then it is not a unique value.
This should find all values in the vector that are unique, regardless of how many unique values there are.
Here's some test code to run it and see:
void printUniques(vector<int> uniques)
{
vector<int>::iterator it;
for(it = uniques.begin() ; it < uniques.end() ; it++)
{
cout << "Unique value: " << *it << endl;
}
}
void WaitForKey()
{
system("pause");
}
int main()
{
vector<int> values;
for(int i = 0 ; i < 10 ; i++)
{
values.push_back(i);
}
/*for(int i = 2 ; i < 10 ; i++)
{
values.push_back(i);
}*/
printUniques(findUniques(values));
WaitForKey();
return -13;
}
As an added bonus:
Here's a version that uses a map, does not use std::find, and gets the job done in O(nlogn) time - n for the for loop, and log(n) for map::find(), which uses a red-black tree.
map<int,bool> mapValues(vector<int> values)
{
map<int, bool> uniques;
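for(unsigned int i = 0 ; i < values.size() ; i++)
{
// do the lookup first: before C++17 the two sides of an assignment may be evaluated in either order
bool firstTime = (uniques.find(values[i]) == uniques.end());
uniques[values[i]] = firstTime;
}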
return uniques;
}
void printUniques(map<int, bool> uniques)
{
cout << endl;
map<int, bool>::iterator it;
for(it = uniques.begin() ; it != uniques.end() ; it++)
{
if(it->second)
cout << "Unique value: " << it->first << endl;
}
}
And an explanation. Iterate over all elements in the vector<int>. If the current member is not in the map, set its value to true. If it is in the map, set the value to false. Afterwards, all values that have the value true are unique, and all values with false have one or more duplicates.

If you have more than two values (one of which has to be unique), you can do it in O(n) expected time and O(n) space by iterating once through the array and filling a hash map (std::unordered_map; an ordered std::map would cost O(n log n)) whose keys are the values and whose mapped values are the number of occurrences of each key.
Then you just have to iterate through the map to find a count of 1. That key is a unique number.
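A minimal sketch of this counting idea, assuming std::unordered_map as the map so that lookups are O(1) on average. The function name valuesSeenOnce is invented here, and the second pass walks the input rather than the map so that the result order is predictable.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical helper: returns the values that occur exactly once, in input order.
std::vector<int> valuesSeenOnce(const std::vector<int>& v)
{
    std::unordered_map<int, int> counts;
    for (int x : v)
        ++counts[x];              // operator[] value-initialises a new count to 0

    std::vector<int> once;
    for (int x : v)               // second pass over the input keeps its order
        if (counts[x] == 1)
            once.push_back(x);
    return once;
}
```

For the input 10, 10, 20, 30, 40, 30 this returns 20 and 40.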

This example uses a map to count the occurrences of each number. A unique number will be seen only once:
#include <iostream>
#include <map>
#include <vector>
int main ()
{
std::map<int,int> mymap;
std::map<int,int>::iterator mit;
std::vector<int> v;
std::vector<int> myunique;
v.push_back(10); v.push_back(10);
v.push_back(20); v.push_back(30);
v.push_back(40); v.push_back(30);
std::vector<int>::iterator vit;
// count occurrences of all numbers
for(vit=v.begin();vit!=v.end();++vit)
{
int number = *vit;
mit = mymap.find(number);
if( mit == mymap.end() )
{
// there's no record in map for your number yet
mymap[number]=1; // we have seen it for the first time
} else {
mit->second++; // this one will not be unique
}
}
// find the unique ones
for(mit=mymap.begin();mit!=mymap.end();++mit)
{
if( mit->second == 1 ) // this was seen only one time
{
myunique.push_back(mit->first);
}
}
// print out unique numbers
for(vit=myunique.begin();vit!=myunique.end();++vit)
std::cout << *vit << std::endl;
return 0;
}
Unique numbers in this example are 20 and 40. There's no need for the list to be ordered for this algorithm.

Do you mean to find a number in a vector which appears only once? The nested loop is the easy solution. I don't think std::find or std::find_if is very useful here. Another option is to sort the vector so that you only need to look for a number that differs from both of its neighbours. It seems overkill, but it is actually O(nlogn) instead of the O(n^2) of the nested loop:
void findUnique(const std::vector<int>& v, std::vector<int> &unique)
{
if(v.size() <= 1)
{
unique = v;
return;
}
unique.clear();
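std::vector<int> w = v;
std::sort(w.begin(), w.end());
// after sorting, a value is unique if it differs from both of its neighbours
for(size_t i = 0; i < w.size(); ++i)
{
bool differsFromPrev = (i == 0) || (w[i-1] != w[i]);
bool differsFromNext = (i + 1 == w.size()) || (w[i] != w[i+1]);
if(differsFromPrev && differsFromNext) unique.push_back(w[i]);
}
// unique contains the numbers that are not repeated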
}

Assuming you are given an array size>=3 which contains one instance of value A, and all other values are B, then you can do this with a single for loop.
int find_odd(int* array, int length) {
// In the first three elements, we are guaranteed to have 2 common ones.
int common=array[0];
if (array[1]!=common && array[2]!=common)
// The second and third elements are the common one, and the one we thought was not.
return common;
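// Now search for the oddball.
for (int i=0; i<length; i++)
if (array[i]!=common) return array[i];
// Not reached if the assumption above holds; return the common value as a fallback.
return common;
}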
EDIT:
K what if more than 2 in an array of 5 are different? – super
Ah... that is a different problem. So you have an array of size n, which contains the common element c more than once, and all other elements exactly once. The goal is to find the set of non-common (i.e. unique) elements right?
Then you need to look at Sylvain's answer above. I think he was answering a different question, but it would work for this. At the end, you will have a hash map full of the counts of each value. Loop through the hash map, and every time you see a value of 1, you will know the key is a unique value in the input array.

Is there a sorted container in the STL?

Is there a sorted container in the STL?
What I mean is following: I have an std::vector<Foo>, where Foo is a custom made class. I also have a comparator of some sort which will compare the fields of the class Foo.
Now, somewhere in my code I am doing:
std::sort( myvec.begin(), myvec.end(), comparator );
which will sort the vector according to the rules I defined in the comparator.
Now I want to insert an element of class Foo into that vector. If I could, I would like to just write:
mysortedvector.push_back( Foo() );
and what would happen is that the vector will put this new element according to the comparator to its place.
Instead, right now I have to write:
myvec.push_back( Foo() );
std::sort( myvec.begin(), myvec.end(), comparator );
which is just a waste of time, since the vector is already sorted and all I need is to place the new element appropriately.
Now, because of the nature of my program, I can't use std::map<> as I don't have a key/value pairs, just a simple vector.
If I use std::list, I again need to call sort after every insertion.

Yes, std::set, std::multiset, std::map, and std::multimap are all sorted using std::less as the default comparison operation. The underlying data structure is typically a balanced binary search tree such as a red-black tree. So if you add elements to these data structures and then iterate over the contained elements, the output will be in sorted order. The complexity of adding N elements to the data structure will be O(N log N), the same as sorting a vector of N elements using any common O(N log N) sort.
In your specific scenario, since you don't have key/value pairs, std::set or std::multiset is probably your best bet.
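Since the question already has a comparator, it can be passed to the container as a template argument. A rough sketch follows, with a made-up Foo and a made-up comparator ByPriority standing in for the asker's own types:

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-in for the asker's class.
struct Foo {
    int id;
    int priority;
};

// Hypothetical comparator: orders Foo objects by priority only.
struct ByPriority {
    bool operator()(const Foo& a, const Foo& b) const { return a.priority < b.priority; }
};

// Elements stay ordered by ByPriority; equal priorities are allowed in a multiset.
using SortedFoos = std::multiset<Foo, ByPriority>;

// Collects the ids in iteration order, which is the sorted order.
std::vector<int> idsInOrder(const SortedFoos& foos)
{
    std::vector<int> ids;
    for (const Foo& f : foos)
        ids.push_back(f.id);
    return ids;
}
```

Each insert() places the new element at its sorted position in O(log N), so no separate std::sort call is needed after adding an element.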

I'd like to expand on Jason's answer. I agree with Jason that either std::set or std::multiset is the best choice for your specific scenario. I'd like to provide an example to help you narrow down the choice further.
Let's assume that you have the following class Foo:
class Foo {
public:
Foo(int v1, int v2) : val1(v1), val2(v2) {};
bool operator<(const Foo &foo) const { return val2 < foo.val2; }
int val1;
int val2;
};
Here, Foo overloads the < operator. This way, you don't need to specify an explicit comparator function. As a result, you can simply use a std::multiset instead of a std::vector in the following way. You just have to replace push_back() by insert():
int main()
{
std::multiset<Foo> ms;
ms.insert(Foo(1, 6));
ms.insert(Foo(1, 5));
ms.insert(Foo(3, 4));
ms.insert(Foo(2, 4));
for (auto const &foo : ms)
std::cout << foo.val1 << " " << foo.val2 << std::endl;
return 0;
}
Output:
3 4
2 4
1 5
1 6
As you can see, the container is sorted by the member val2 of the class Foo, based on the < operator. However, if you use std::set instead of a std::multiset, then you will get a different output:
int main()
{
std::set<Foo> s;
s.insert(Foo(1, 6));
s.insert(Foo(1, 5));
s.insert(Foo(3, 4));
s.insert(Foo(2, 4));
for (auto const &foo : s)
std::cout << foo.val1 << " " << foo.val2 << std::endl;
return 0;
}
Output:
3 4
1 5
1 6
Here, the second Foo object whose val2 is 4 is missing, because a std::set only allows unique entries. Whether entries are unique is decided by the provided < operator. In this example, the < operator compares the val2 members with each other. Therefore, two Foo objects are considered equivalent if their val2 members have the same value.
So, your choice depends on whether or not you want to store Foo objects that may be equal based on the < operator.

C++ does have sorted containers, e.g. std::set and std::map:
int main()
{
//ordered set
set<int> s;
s.insert(5);
s.insert(1);
s.insert(6);
s.insert(3);
s.insert(7);
s.insert(2);
cout << "Elements of set in sorted order: ";
for (auto it : s)
cout << it << " ";
return 0;
}
Output:
Elements of set in sorted order: 1 2 3 5 6 7
int main()
{
// Ordered map
std::map<int, int> order;
// Mapping values to keys
order[5] = 10;
order[3] = 5;
order[20] = 100;
order[1] = 1;
// Iterating the map and printing ordered values
for (auto i = order.begin(); i != order.end(); i++) {
std::cout << i->first << " : " << i->second << '\n';
}
return 0;
}
Output:
1 : 1
3 : 5
5 : 10
20 : 100

How to navigate through a vector using iterators? (C++)

The goal is to access the "nth" element of a vector of strings instead of the [] operator or the "at" method. From what I understand, iterators can be used to navigate through containers, but I've never used iterators before, and what I'm reading is confusing.
If anyone could give me some information on how to achieve this, I would appreciate it. Thank you.

You need to make use of the begin and end methods of the vector class, which return iterators referring to the first element and to the position one past the last element, respectively.
using namespace std;
vector<string> myvector; // a vector of strings.
// push some strings in the vector.
myvector.push_back("a");
myvector.push_back("b");
myvector.push_back("c");
myvector.push_back("d");
vector<string>::iterator it; // declare an iterator to a vector of strings
int n = 3; // nth element to be found.
int i = 0; // counter.
// now start at from the beginning
// and keep iterating over the element till you find
// nth element...or reach the end of vector.
for(it = myvector.begin(); it != myvector.end(); it++,i++ ) {
// found nth element..print and break.
if(i == n) {
cout<< *it << endl; // prints d.
break;
}
}
// other easier ways of doing the same.
// using operator[]
cout<<myvector[n]<<endl; // prints d.
// using the at method
cout << myvector.at(n) << endl; // prints d.

In C++11 you can do:
std::vector<int> v = {0, 1, 2, 3, 4, 5};
for (auto i : v)
{
// access by value, the type of i is int
std::cout << i << ' ';
}
std::cout << '\n';
See here for variations: https://en.cppreference.com/w/cpp/language/range-for

Typically, iterators are used to access elements of a container in linear fashion; however, with "random access iterators", it is possible to access any element in the same fashion as operator[].
To access arbitrary elements in a vector vec, you can use the following:
vec.begin() // 1st
vec.begin()+1 // 2nd
// ...
vec.begin()+(i-1) // ith
// ...
vec.begin()+(vec.size()-1) // last
The following is an example of a typical access pattern (earlier versions of C++):
int sum = 0;
typedef std::vector<int>::const_iterator Iter; // typedef also works before C++11
for (Iter it = vec.begin(); it!=vec.end(); ++it) {
sum += *it;
}
The advantage of using iterators is that you can apply the same pattern with other containers (here Iter would be the const_iterator type of lst, for example std::list<int>::const_iterator):
sum = 0;
for (Iter it = lst.begin(); it!=lst.end(); ++it) {
sum += *it;
}
For this reason, it is really easy to create template code that will work the same regardless of the container type.
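As a rough illustration of such template code, here is a sketch of a summing function that works with any container exposing begin(), end() and a const_iterator type. The name sumAll and the assumption that the elements are ints are mine.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Hypothetical helper: sums the int elements of any container with begin()/end().
template <typename Container>
int sumAll(const Container& c)
{
    int sum = 0;
    // The same iterator loop compiles unchanged for vector, list, set, and so on.
    for (typename Container::const_iterator it = c.begin(); it != c.end(); ++it)
        sum += *it;
    return sum;
}
```

The same call, sumAll(x), then works whether x is a std::vector<int> or a std::list<int>.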
Another advantage of iterators is that they don't assume the data is resident in memory; for example, one could create a forward iterator that reads data from an input stream, or that simply generates data on the fly (e.g. a range or random number generator).
Another option using std::for_each and lambdas:
sum = 0;
std::for_each(vec.begin(), vec.end(), [&sum](int i) { sum += i; });
Since C++11 you can use auto to avoid specifying a very long, complicated type name of the iterator as seen before (or even more complex):
sum = 0;
for (auto it = vec.begin(); it!=vec.end(); ++it) {
sum += *it;
}
And, in addition, there is a simpler for-each variant:
sum = 0;
for (auto value : vec) {
sum += value;
}
And finally there is also std::accumulate where you have to be careful whether you are adding integer or floating point numbers.
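The pitfall with std::accumulate is that the type of the initial value determines the type of the running sum. A small sketch to show the difference (the two function names are made up for illustration):

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// An int initial value means every partial sum is converted back to int.
inline int sumTruncated(const std::vector<double>& v)
{
    return std::accumulate(v.begin(), v.end(), 0);    // fractions are lost at each step
}

// A double initial value keeps the running sum in double.
inline double sumExact(const std::vector<double>& v)
{
    return std::accumulate(v.begin(), v.end(), 0.0);
}
```

With the four values 0.5, 0.5, 0.5, 0.5 the first function returns 0 while the second returns 2.0.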

Vector's iterators are random access iterators which means they look and feel like plain pointers.
You can access the nth element by adding n to the iterator returned from the container's begin() method, or you can use operator [].
std::vector<int> vec(10);
std::vector<int>::iterator it = vec.begin();
int sixth = *(it + 5);
int third = *(2 + it);
int second = it[1];
Alternatively you can use the advance function which works with all kinds of iterators. (You'd have to consider whether you really want to perform "random access" with non-random-access iterators, since that might be an expensive thing to do.)
std::vector<int> vec(10);
std::vector<int>::iterator it = vec.begin();
std::advance(it, 5);
int sixth = *it;

Here is an example of accessing the ith index of a std::vector using an iterator within a loop, which does not require incrementing two counters.
std::vector<std::string> strs = {"sigma", "alpha", "beta", "rho", "nova"};
int nth = 2;
std::vector<std::string>::iterator it;
for(it = strs.begin(); it != strs.end(); it++) {
int ith = it - strs.begin();
if(ith == nth) {
printf("Iterator within a for-loop: strs[%d] = %s\n", ith, (*it).c_str());
}
}
Without a for-loop
it = strs.begin() + nth;
printf("Iterator without a for-loop: strs[%d] = %s\n", nth, (*it).c_str());
and using at method:
printf("Using at position: strs[%d] = %s\n", nth, strs.at(nth).c_str());
