How to find a unique number using std::find - c++

Hey here is a trick question asked in class today, I was wondering if there is a way to find a unique number in a array, The usual method is to use two for loops and get the unique number which does not match with all the others I am using std::vectors for my array in C++ and was wondering if find could spot the unique number as I wouldn't know where the unique number is in the array.

Assuming that we know that the vector has at least three
elements (because otherwise, the question doesn't make sense),
just look for an element different from the first. If it
happens to be the second, of course, we have to check the third
to see whether it was the first or the second which is unique,
which means a little extra code, but roughly:
std::vector<int>::const_iterator
findUniqueEntry( std::vector<int>::const_iterator begin,
std::vector<int>::const_iterator end )
{
std::vector<int>::const_iterator result
= std::find_if(
next( begin ), end, []( int value) { return value != *begin );
if ( result == next( begin ) && *result == *next( result ) ) {
-- result;
}
return result;
}
(Not tested, but you get the idea.)

As others have said, sorting is one option. Then your unique value(s) will have a different value on either side.
Here's another option that solves it, using std::find, in O(n^2) time(one iteration of the vector, but each iteration iterates through the whole vector, minus one element.) - sorting not required.
vector<int> findUniques(vector<int> values)
{
vector<int> uniqueValues;
vector<int>::iterator begin = values.begin();
vector<int>::iterator end = values.end();
vector<int>::iterator current;
for(current = begin ; current != end ; current++)
{
int val = *current;
bool foundBefore = false;
bool foundAfter = false;
if (std::find(begin, current, val) != current)
{
foundBefore = true;
}
else if (std::find(current + 1, end, val) != end)
{
foundAfter = true;
}
if(!foundBefore && !foundAfter)
uniqueValues.push_back(val);
}
return uniqueValues;
}
Basically what is happening here, is that I am running ::find on the elements in the vector before my current element, and also running ::find on the elements after my current element. Since my current element already has the value stored in 'val'(ie, it's in the vector once already), if I find it before or after the current value, then it is not a unique value.
This should find all values in the vector that are not unique, regardless of how many unique values there are.
Here's some test code to run it and see:
void printUniques(vector<int> uniques)
{
vector<int>::iterator it;
for(it = uniques.begin() ; it < uniques.end() ; it++)
{
cout << "Unique value: " << *it << endl;
}
}
void WaitForKey()
{
system("pause");
}
int main()
{
vector<int> values;
for(int i = 0 ; i < 10 ; i++)
{
values.push_back(i);
}
/*for(int i = 2 ; i < 10 ; i++)
{
values.push_back(i);
}*/
printUniques(findUniques(values));
WaitForKey();
return -13;
}
As an added bonus:
Here's a version that uses a map, does not use std::find, and gets the job done in O(nlogn) time - n for the for loop, and log(n) for map::find(), which uses a red-black tree.
map<int,bool> mapValues(vector<int> values)
{
map<int, bool> uniques;
for(unsigned int i = 0 ; i < values.size() ; i++)
{
uniques[values[i]] = (uniques.find(values[i]) == uniques.end());
}
return uniques;
}
void printUniques(map<int, bool> uniques)
{
cout << endl;
map<int, bool>::iterator it;
for(it = uniques.begin() ; it != uniques.end() ; it++)
{
if(it->second)
cout << "Unique value: " << it->first << endl;
}
}
And an explanation. Iterate over all elements in the vector<int>. If the current member is not in the map, set its value to true. If it is in the map, set the value to false. Afterwards, all values that have the value true are unique, and all values with false have one or more duplicates.

If you have more than two values (one of which has to be unique), you can do it in O(n) in time and space by iterating a first time through the array and filling a map that has as a key the value, and value the number of occurences of the key.
Then you just have to iterate through the map in order to find a value of 1. That would be a unique number.

This example uses a map to count number occurences. Unique number will be seen only one time:
#include <iostream>
#include <map>
#include <vector>
int main ()
{
std::map<int,int> mymap;
std::map<int,int>::iterator mit;
std::vector<int> v;
std::vector<int> myunique;
v.push_back(10); v.push_back(10);
v.push_back(20); v.push_back(30);
v.push_back(40); v.push_back(30);
std::vector<int>::iterator vit;
// count occurence of all numbers
for(vit=v.begin();vit!=v.end();++vit)
{
int number = *vit;
mit = mymap.find(number);
if( mit == mymap.end() )
{
// there's no record in map for your number yet
mymap[number]=1; // we have seen it for the first time
} else {
mit->second++; // thiw one will not be unique
}
}
// find the unique ones
for(mit=mymap.begin();mit!=mymap.end();++mit)
{
if( mit->second == 1 ) // this was seen only one time
{
myunique.push_back(mit->first);
}
}
// print out unique numbers
for(vit=myunique.begin();vit!=myunique.end();++vit)
std::cout << *vit << std::endl;
return 0;
}
Unique numbers in this example are 20 and 40. There's no need for the list to be ordered for this algorithm.

Do you mean to find a number in a vector which appears only once? The nested loop if the easy solution. I don't think std::find or std::find_if is very useful here. Another option is to sort the vector so that you only need to find two consecutive numbers that are different. It seems overkill, but it is actually O(nlogn) instead of O(n^2) as the nested loop:
void findUnique(const std::vector<int>& v, std::vector<int> &unique)
{
if(v.size() <= 1)
{
unique = v;
return;
}
unique.clear();
vector<int> w = v;
std::sort(w.begin(), w.end());
if(w[0] != w[1]) unique.push_back(w[0]);
for(size_t i = 1; i < w.size(); ++i)
if(w[i-1] != w[i]) unique.push_back(w[i]);
// unique contains the numbers that are not repeated
}

Assuming you are given an array size>=3 which contains one instance of value A, and all other values are B, then you can do this with a single for loop.
int find_odd(int* array, int length) {
// In the first three elements, we are guaranteed to have 2 common ones.
int common=array[0];
if (array[1]!=common && array[2]!=common)
// The second and third elements are the common one, and the one we thought was not.
return common;
// Now search for the oddball.
for (int i=0; i<length; i++)
if (array[i]!=common) return array[i];
}
EDIT:
K what if more than 2 in an array of 5 are different? – super
Ah... that is a different problem. So you have an array of size n, which contains the common element c more than once, and all other elements exactly once. The goal is to find the set of non-common (i.e. unique) elements right?
Then you need to look at Sylvain's answer above. I think he was answering a different question, but it would work for this. At the end, you will have a hash map full of the counts of each value. Loop through the hash map, and every time you see a value of 1, you will know the key is a unique value in the input array.

Related

Efficient way of finding if a container contains duplicated values with STL? [duplicate]

I wrote this code in C++ as part of a uni task where I need to ensure that there are no duplicates within an array:
// Check for duplicate numbers in user inputted data
int i; // Need to declare i here so that it can be accessed by the 'inner' loop that starts on line 21
for(i = 0;i < 6; i++) { // Check each other number in the array
for(int j = i; j < 6; j++) { // Check the rest of the numbers
if(j != i) { // Makes sure don't check number against itself
if(userNumbers[i] == userNumbers[j]) {
b = true;
}
}
if(b == true) { // If there is a duplicate, change that particular number
cout << "Please re-enter number " << i + 1 << ". Duplicate numbers are not allowed:" << endl;
cin >> userNumbers[i];
}
} // Comparison loop
b = false; // Reset the boolean after each number entered has been checked
} // Main check loop
It works perfectly, but I'd like to know if there is a more elegant or efficient way to check.
You could sort the array in O(nlog(n)), then simply look until the next number. That is substantially faster than your O(n^2) existing algorithm. The code is also a lot cleaner. Your code also doesn't ensure no duplicates were inserted when they were re-entered. You need to prevent duplicates from existing in the first place.
std::sort(userNumbers.begin(), userNumbers.end());
for(int i = 0; i < userNumbers.size() - 1; i++) {
if (userNumbers[i] == userNumbers[i + 1]) {
userNumbers.erase(userNumbers.begin() + i);
i--;
}
}
I also second the reccomendation to use a std::set - no duplicates there.
The following solution is based on sorting the numbers and then removing the duplicates:
#include <algorithm>
int main()
{
int userNumbers[6];
// ...
int* end = userNumbers + 6;
std::sort(userNumbers, end);
bool containsDuplicates = (std::unique(userNumbers, end) != end);
}
Indeed, the fastest and as far I can see most elegant method is as advised above:
std::vector<int> tUserNumbers;
// ...
std::set<int> tSet(tUserNumbers.begin(), tUserNumbers.end());
std::vector<int>(tSet.begin(), tSet.end()).swap(tUserNumbers);
It is O(n log n). This however does not make it, if the ordering of the numbers in the input array needs to be kept... In this case I did:
std::set<int> tTmp;
std::vector<int>::iterator tNewEnd =
std::remove_if(tUserNumbers.begin(), tUserNumbers.end(),
[&tTmp] (int pNumber) -> bool {
return (!tTmp.insert(pNumber).second);
});
tUserNumbers.erase(tNewEnd, tUserNumbers.end());
which is still O(n log n) and keeps the original ordering of elements in tUserNumbers.
Cheers,
Paul
It is in extension to the answer by #Puppy, which is the current best answer.
PS : I tried to insert this post as comment in the current best answer by #Puppy but couldn't so as I don't have 50 points yet. Also a bit of experimental data is shared here for further help.
Both std::set and std::map are implemented in STL using Balanced Binary Search tree only. So both will lead to a complexity of O(nlogn) only in this case. While the better performance can be achieved if a hash table is used. std::unordered_map offers hash table based implementation for faster search. I experimented with all three implementations and found the results using std::unordered_map to be better than std::set and std::map. Results and code are shared below. Images are the snapshot of performance measured by LeetCode on the solutions.
bool hasDuplicate(vector<int>& nums) {
size_t count = nums.size();
if (!count)
return false;
std::unordered_map<int, int> tbl;
//std::set<int> tbl;
for (size_t i = 0; i < count; i++) {
if (tbl.find(nums[i]) != tbl.end())
return true;
tbl[nums[i]] = 1;
//tbl.insert(nums[i]);
}
return false;
}
unordered_map Performance (Run time was 52 ms here)
Set/Map Performance
You can add all elements in a set and check when adding if it is already present or not. That would be more elegant and efficient.
I'm not sure why this hasn't been suggested but here is a way in base 10 to find duplicates in O(n).. The problem I see with the already suggested O(n) solution is that it requires that the digits be sorted first.. This method is O(n) and does not require the set to be sorted. The cool thing is that checking if a specific digit has duplicates is O(1). I know this thread is probably dead but maybe it will help somebody! :)
/*
============================
Foo
============================
*
Takes in a read only unsigned int. A table is created to store counters
for each digit. If any digit's counter is flipped higher than 1, function
returns. For example, with 48778584:
0 1 2 3 4 5 6 7 8 9
[0] [0] [0] [0] [2] [1] [0] [2] [2] [0]
When we iterate over this array, we find that 4 is duplicated and immediately
return false.
*/
bool Foo(int number)
{
int temp = number;
int digitTable[10]={0};
while(temp > 0)
{
digitTable[temp % 10]++; // Last digit's respective index.
temp /= 10; // Move to next digit
}
for (int i=0; i < 10; i++)
{
if (digitTable [i] > 1)
{
return false;
}
}
return true;
}
It's ok, specially for small array lengths. I'd use more efficient aproaches (less than n^2/2 comparisons) if the array is mugh bigger - see DeadMG's answer.
Some small corrections for your code:
Instead of int j = i writeint j = i +1 and you can omit your if(j != i) test
You should't need to declare i variable outside the for statement.
I think #Michael Jaison G's solution is really brilliant, I modify his code a little to avoid sorting. (By using unordered_set, the algorithm may faster a little.)
template <class Iterator>
bool isDuplicated(Iterator begin, Iterator end) {
using T = typename std::iterator_traits<Iterator>::value_type;
std::unordered_set<T> values(begin, end);
std::size_t size = std::distance(begin,end);
return size != values.size();
}
//std::unique(_copy) requires a sorted container.
std::sort(cont.begin(), cont.end());
//testing if cont has duplicates
std::unique(cont.begin(), cont.end()) != cont.end();
//getting a new container with no duplicates
std::unique_copy(cont.begin(), cont.end(), std::back_inserter(cont2));
#include<iostream>
#include<algorithm>
int main(){
int arr[] = {3, 2, 3, 4, 1, 5, 5, 5};
int len = sizeof(arr) / sizeof(*arr); // Finding length of array
std::sort(arr, arr+len);
int unique_elements = std::unique(arr, arr+len) - arr;
if(unique_elements == len) std::cout << "Duplicate number is not present here\n";
else std::cout << "Duplicate number present in this array\n";
return 0;
}
As mentioned by #underscore_d, an elegant and efficient solution would be,
#include <algorithm>
#include <vector>
template <class Iterator>
bool has_duplicates(Iterator begin, Iterator end) {
using T = typename std::iterator_traits<Iterator>::value_type;
std::vector<T> values(begin, end);
std::sort(values.begin(), values.end());
return (std::adjacent_find(values.begin(), values.end()) != values.end());
}
int main() {
int user_ids[6];
// ...
std::cout << has_duplicates(user_ids, user_ids + 6) << std::endl;
}
fast O(N) time and space solution
return first when it hits duplicate
template <typename T>
bool containsDuplicate(vector<T>& items) {
return any_of(items.begin(), items.end(), [s = unordered_set<T>{}](const auto& item) mutable {
return !s.insert(item).second;
});
}
Not enough karma to post a comment. Hence a post.
vector <int> numArray = { 1,2,1,4,5 };
unordered_map<int, bool> hasDuplicate;
bool flag = false;
for (auto i : numArray)
{
if (hasDuplicate[i])
{
flag = true;
break;
}
else
hasDuplicate[i] = true;
}
(flag)?(cout << "Duplicate"):("No duplicate");

I need a std function which checks how many elements occur exactly once in vector

Is there any STL function which does this?
For vector:
4 4 5 5 6 7
The expected output should be 2,because of one 6 and 7
Would you be kind to help me count them classic if there is no STL function?
I don't think there is an algorithm for that in STL. You can copy into a multimap or use a map of frequencies as suggested, but it does extra work that's not necessary because your array happens to be sorted. Here is a simple algorithm that counts the number of singular elements i.e. elements that appear only once in a sorted sequence.
int previous = v.front();
int current_count = 0;
int total_singular = 0;
for(auto n : v) {
if(previous == n) // check if same as last iteration
current_count++; // count the elements equal to current value
else {
if(current_count == 1) // count only those that have one copy for total
total_singular++;
previous = n;
current_count = 1; // reset counter, because current changed
}
}
if(current_count == 1) // check the last number
total_singular++;
You could also use count_if with a stateful lambda, but I don't think it'll make the algorithm any simpler.
If performance and memory doesn't matter to you, use std::map (or unordered version) for this task:
size_t count(const std::vector<int>& vec){
std::map<int,unsigned int> occurenceMap;
for (auto i : vec){
occurenceMap[i]++;
}
size_t count = 0U;
for (const auto& pair : occurenceMap){
if (pair.second == 1U){
count++;
}
}
return count;
}
with templates, it can be generalize to any container type and any containee type.
Use std::unique to count the unique entries(ct_u) and then user vector count on the original one (ct_o). Difference ct_o-ct_u would give the answer.
P.S.: this will only work if the identical entries are together in the original vector. If not, you may want to sort the vector first.
Using algorithm:
std::size_t count_unique(const std::vector<int>& v)
{
std::size_t count = 0;
for (auto it = v.begin(); it != v.end(); )
{
auto it2 = std::find_if(it + 1, v.end(), [&](int e) { return e != *it; });
count += (it2 - it == 1);
it = it2;
}
return count;
}
Demo

How to output a missing number in set in c++?

If I have a set in C++, and it contains numbers from 0 to n. I wish to find out the number that is missing from 1 to n and output that and if none of them is missing, then output the number (n+1).
For example, if the set contains, 0 1 2 3 4 5 6 7, then it should output 8
If it contains, 0 1 3 4 5 6, then it should output 2.
I made the following code for this, but it always seems to output 0. I dont know what is the problem.
set<int>::iterator i = myset.begin();
set<int>::iterator j = i++;
while (1)
{
if ( *(j) != *(i)+1 )
{
cout<<*(j)<<"\n";
break;
}
else
{
i++;
j++;
}
}
What is the problem? Thanks!
The problem is that you're advancing i:
set<int>::iterator i = myset.begin(); // <-- i points to first element
set<int>::iterator j = i++; // <-- j points to first element
// i points to second!
while (1)
{ // so if our set starts with {0, 1, ...}
if ( *(j) != *(i)+1 ) // then *j == 0, *i == 1, *i + 1 == 2, so this
// inequality holds
What you meant to do is have j be the next iterator after i:
std::set<int>::iterator i = myset.begin(), j = myset.begin();
std::advance(j, 1);
With C++11, there's also std::next():
auto i = myset.begin();
auto j = std::next(i, 1);
Or, alternatively, just reverse your construction:
std::set<int>::iterator j = myset.begin();
std::set<int>::iterator i = j++; // now i is the first element, j is the second
Or, lastly, you really only need one iterator:
int expected = 0;
for (std::set<int>::iterator it = myset.begin(); it != myset.end();
++it, ++expected)
{
if (*it != expected) {
std::cout << "Missing " << expected << std::endl;
break;
}
}
The easiest stuff: Use count() function of set to check whether an element is present in set or not.
The count() takes an integer argument: The number whose existence in the set is to be checked. If the element is present in set, count() returns a non zero value, else it returns 0.
For example:
#include <iostream>
#include <set>
using namespace std;
int main()
{
set<int> s;
//I insert 0 - 4 in the set.
for(int i=0;i < 5; ++i)
s.insert(i);
//Let 10 be the 'n'.
for(int i = 0; i < 10; ++i)
{
//If i is NOT present in the set, the below condition will be true.
if (!s.count(i))
cout<<i<<" is missing!\n";
}
}
One problem is that you access beyond the end of the set if the set is
dense, which is undefined behavior. Another is that you always output
an element in the set (*j, where j is an iterator into the set);
according to your description, what you want to output is a value which
isn't in the set.
There are a couple of ways of handling this. The simplest is probably
just to maintain a variable expected, initialized with 0 and
incremented each time. Something like the following.
int
firstAvailable( std::set<int> const& allocated )
{
int expected = 0;
for ( auto current = allocated.begin(), end = allocated.end();
current != end && *current == expected;
++ current ) {
++ expected;
}
return expected;
}
If you don't want to return 0 if the list isn't empty, but starts with
some other value, initialize expected with the first element of the
set (or with whatever you want to return if the set is empty).
EDIT:
Of course, other algorithms may be better. For this sort of thing, I usually use a bit map.
The problem with your original code has already been pointed out in the other answers. You're modifying i while assigning j. You can fix this by initializing the iterators as:
set<int>::iterator i = myset.begin();
set<int>::iterator j = i;
j++;
A more elegant solution is to take advantage of the fact that the sum of all values up to n is n * (n + 1) / 2. You can calculate the sum of the actual values, and subtract it from the full sum to obtain the missing value:
int findMissing(const std::set<int>& valSet) {
int valCount = valSet.size();
int allSum = (valCount * (valCount + 1)) >> 1;
int valSum = std::accumulate(valSet.begin(), valSet.end(), 0);
return allSum - valSum;
}
The big advantage of this approach is that it does not rely on using a container where iterators provide the values in sorted order. You can use the same solution e.g. on an unsorted std::vector.
One danger to look out for when using this approach is overflow. With 32-bit ints, it will overflow with approximately 2^16 values. It might actually still work if it overflows, particularly if you use unsigned instead of int, but I did not confirm that.
There's a similar approach that uses the XOR of all values instead of the sum, which does not have the problem with overflow.

Find number of occurrences of numbers in a vector in C++

I have a vector of integers. which contains numbers. I want to count the number of occurrences of every number in this vector. So what will be the optimum way to do this. As I am new to Vectors please let me know any optimum method.
You can use a hash table, implemented by std::unordered_map. For example:
#include <unordered_map>
#include <vector>
void count_occurrence(std::unordered_map<int,int>& m, std::vector<int>& v){
for (auto itr = v.begin(); itr != v.end(); ++itr){
++m[*itr];
}
}
//...somewhere else
//you already have std::vector v filled
std::unordered_map<int,int> m;
count_occurrence(m, v);
//print the number of occurrences of 1
std::cout<<m[1]<<std::endl;
You could sort the elements of the vector
Iterate through vector
Store the current integer as x
Compare current index to previous index.
If they are equal, increment another variable as f
If they are unequal, begin the cycle again
This of course is by no means a step by step instruction, but it contains enough direction to get you going
To find the mode of a number of integers stored in an array/list/vector/etc. where v is the DS and num is the number of integers.
You may use the following technique:
Its simple and sober.
i = 0;
int m = 0, mode, c = 0, nc = 0;
while(i<num)
{
c = 0;
nc = v[i];
c++;
i++;
while(nc == v[i])
{
c++;
i++;
}
if(m < c)
{
m = c;
mode = nc;
}
}

Given a vector with integers from 0 to n, but not all included, how do I efficiently get the non-included integers?

Given a vector with integers from 0 to n, but not all included, how do I efficiently get the non-included integers?
For example if I have a vector with 1 2 3 5, I need to get the vector that contains 0 4.
But I need to do it very efficiently.
Since the vector is already sorted, this becomes trivial:
vector<int> v = {1,2,3,5};
vector<int> ret;
v.push_back(n+1); // this is to enforce a limit using less branches in the loop
for(int i = 0, j = 0; i <= n; ++i){
int present = v[j++];
while(i < present){
ret.push_back(i++);
}
}
return ret;
Additionally, if it wasn't sorted, you could either sort it and apply the above algorithm, or, if you know the range of n, and you can afford the extra memory, you could instead create an array of boolean (or a bitset) and mark the index corresponding to every element you encounter (e.g. bitset[v[j++]] = true;), subsequently iterating from 0 to n and inserting into your vector every element whose bitset position has not been marked.
Basically the idea presented here is that we know the number of missing items beforehand if we can assume sorted input without duplicate values.
Then it is possible to pre-allocate enough space to hold the missing values beforehand (no later dynamic allocation required). Then we can also exploit the possible shortcut when all missing values were found.
If the input vector is not sorted or contains duplicate values, a wrapper function can be used that establishes this precondition.
#include <iostream>
#include <set>
#include <vector>
inline std::vector<int> find_missing(std::vector<int> const & input) {
// assuming non-empty, sorted input, no duplicates
// number of items missing
int n_missing = input.back() - input.size() + 1;
// pre-allocate enough memory for missing values
std::vector<int> result(n_missing);
// iterate input vector with shortcut if all missing values were found
auto input_it = input.begin();
auto result_it = result.begin();
for (int i = 0; result_it != result.end() && input_it != input.end(); ++i) {
if (i < *input_it) (*result_it++) = i;
else ++input_it;
}
return result;
}
// use this if the input vector is not sorted/unique
inline std::vector<int> find_missing_unordered(std::vector<int> const & input) {
std::set<int> values(input.begin(), input.end());
return find_missing(std::vector<int>(values.begin(), values.end()));
}
int main() {
std::vector<int> input = {1,2,3,5,5,5,7};
std::vector<int> result = find_missing_unordered(input);
for (int i : result)
std::cout << i << " ";
std::cout << "\n";
}
The output is:
$ g++ test.cc -std=c++11 && ./a.out
0 4 6