Removing duplicates from a sorted array in c++

I am trying to remove the duplicates from a sorted array. The code gives the correct output for one test case but fails for multiple test cases. I get the correct output with other methods, but what is wrong with this method? How can I solve this problem?
#include <iostream>
#include<bits/stdc++.h>
using namespace std;
int main() {
// your code goes here
int t;
cin>>t;
while(t--){
int n;
cin>>n;
int a[n],i,k,temp,count;
for(i=0;i<n;i++){
cin>>a[i];
}
sort(a,a+n);
count=0;
for(i=0;i<n;i++){
if(a[i-1]-a[i]==0){
temp = a[i];
count++;
for(k=i;k<n;k++){
a[k] = a[k+1];
}
}
}
for(i=0;i<n-count;i++){
cout<<a[i]<<" ";
}
cout<<endl;
}
}

A variable length array like this
int a[n],i,k,temp,count;
is not a standard C++ feature. Instead you should use the standard container std::vector<int>.
This if statement
if(a[i-1]-a[i]==0){
invokes undefined behavior when i is equal to 0 due to the expression a[i-1].
The same problem exists in this for loop
for(k=i;k<n;k++){
a[k] = a[k+1];
}
when k is equal to n - 1 due to the expression a[k+1].
Also, it is inefficient to shift all the elements that follow a duplicate every time a duplicate is found.
Note that the standard algorithm std::unique can be used instead of your loops.
If you want to use a for loop, you could implement something like the following
#include <iostream>
int main()
{
int a[] = { 1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5 };
const size_t N = sizeof( a ) / sizeof( *a );
size_t n = 0;
for ( size_t i = 0; i < N; i++ )
{
if ( i == 0 || a[i] != a[n-1] )
{
if ( i != n ) a[n] = a[i];
++n;
}
}
for ( size_t i = 0; i < n; i++ )
{
std::cout << a[i] << ' ';
}
std::cout << '\n';
return 0;
}
The program output is
1 2 3 4 5
If you use the standard algorithm std::unique, the solution is simpler because there is no need to write your own for loop.
#include <iostream>
#include <iterator>
#include <algorithm>
int main()
{
int a[] = { 1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5 };
auto last = std::unique( std::begin( a ), std::end( a ) );
for ( auto first = std::begin( a ); first != last; ++first )
{
std::cout << *first << ' ';
}
std::cout << '\n';
return 0;
}
The program output is the same as shown above, that is
1 2 3 4 5
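The two suggestions above (std::vector instead of a variable length array, std::unique instead of hand-written loops) can be combined. The following is only a sketch: the function name sorted_unique is made up for illustration and does not appear in the question.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: returns the distinct values of the input in ascending order.
std::vector<int> sorted_unique(std::vector<int> values)
{
    std::sort(values.begin(), values.end());
    // std::unique moves the distinct elements to the front and returns
    // an iterator to the new logical end of the range.
    auto last = std::unique(values.begin(), values.end());
    // Erase the leftover tail so the vector holds only the distinct values.
    values.erase(last, values.end());
    return values;
}
```

Inside the while(t--) loop of the question, one would read the n values into a std::vector<int>, call the helper, and print the returned vector.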

I see two major problems with your code, both of which are out-of-bounds reads from an array:
if(a[i-1]-a[i]==0) is evaluated on the first iteration with i==0, accessing element a[-1].
And here:
for(k=i;k<n;k++){
a[k] = a[k+1];
}
in the last loop iteration, when k == n-1 array element a[n] will be accessed, which is also an out-of-bounds access.
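If the shifting approach from the question is to be kept, the two reads can be brought back in bounds as sketched below. This is an illustration under assumptions, not the asker's code: the function name is invented, a std::vector replaces the array, and the index is deliberately not advanced after a removal, which appears to be needed when a value occurs three or more times.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: removes adjacent duplicates from a sorted vector by
// shifting the tail left. Returns the number of elements kept at the front.
std::size_t remove_duplicates_by_shifting(std::vector<int>& a)
{
    std::size_t n = a.size();  // logical length, shrinks as duplicates are removed
    std::size_t i = 1;         // start at 1 so that a[i - 1] is always valid
    while (i < n)
    {
        if (a[i - 1] == a[i])
        {
            // Shift the tail left by one; the bound k + 1 < n keeps a[k + 1] valid.
            for (std::size_t k = i; k + 1 < n; ++k)
            {
                a[k] = a[k + 1];
            }
            --n;               // do not advance i: the new a[i] must be checked too
        }
        else
        {
            ++i;
        }
    }
    return n;
}
```

Only the first n elements are meaningful after the call, so the printing loop would run up to the returned value.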

Related

Error in Removal of Duplicates from a Sorted Array in C++ using Pointers

I have a task to remove the duplicates from a sorted array.
However, when I try this it doesn't remove anything and still gives me the same values in the output as the original.
I think I'm missing something in the removeDuplicates() function.
Also pointer notation would be recommended. Thank you!
void removeDuplicates(int *arr, int *size)
{
int s,*p,i,k=0;
p=arr;
s=*size;
int arr1[s];
for(i=0;i<s-1;i++)
{
if (*(p+i)!=*(p+i+1))
{
arr1[k++]=*(p+i);
}
}
arr1[k++]=*(p+s-1);
for(i=0; i<k; i++)
{
*(p+i) = arr1[i];
}
for(i=0; i<k; i++)
{
cout<<*(p+i)<<endl;
}
}

For starters, a variable length array such as the one declared in your function
int arr1[s] = {};
is not a standard C++ feature. Moreover, in C, where variable length arrays do exist, you may not initialize them in their declarations.
Moreover if the source array contains only one or two different elements then the value of the variable k will be incorrect and equal to either 0 (instead of 1) or 1 (instead of 2).
Apart from this, the function should not output anything. It is the caller of the function that decides whether to output the sub-array of unique elements. And since the second parameter is passed by reference in the C sense (through a pointer), its value should be updated within the function.
There is a standard algorithm, std::unique, that can be used to do the task. Here is a demonstration program.
#include <iostream>
#include <iterator>
#include <algorithm>
int main()
{
int a[] = { 1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5 };
auto last = std::unique( std::begin( a ), std::end( a ) );
for ( auto first = std::begin( a ); first != last; ++ first )
{
std::cout << *first << ' ';
}
std::cout << '\n';
return 0;
}
The program output is
1 2 3 4 5
If you want to write a similar function for arrays yourself, using pointers within the function, it could look, for example, like this
#include <iostream>
template <typename T>
size_t removeDuplicates( T *a, size_t n )
{
T *dest = a;
if ( n != 0 )
{
++dest;
for ( T *current = a; ++current != a + n; )
{
if ( *current != *( dest - 1 ) )
{
if ( dest != current )
{
*dest = *current;
}
++dest;
}
}
}
return dest - a;
}
int main()
{
int a[] = { 1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5 };
const size_t N = sizeof( a ) / sizeof( *a );
size_t last = removeDuplicates( a, N );
for ( size_t first = 0; first != last; ++first )
{
std::cout << a[first] << ' ';
}
std::cout << '\n';
return 0;
}
Again the program output is
1 2 3 4 5
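If the signature from the question has to be kept, the same idea can be written so that the new length is reported through the size pointer and nothing is printed. This is a minimal sketch, assuming the input is sorted and that size points to a valid, non-negative length.

```cpp
// Compacts a sorted array in place and stores the new length in *size.
// No temporary array and no output: printing is left to the caller.
void removeDuplicates(int *arr, int *size)
{
    if (*size <= 0) return;           // nothing to do for an empty array
    int *dest = arr + 1;              // next free slot; the first element is always kept
    for (int *current = arr + 1; current != arr + *size; ++current)
    {
        if (*current != *(dest - 1))  // differs from the last element kept
        {
            *dest++ = *current;
        }
    }
    *size = static_cast<int>(dest - arr);
}
```

After the call, the caller prints the first *size elements of the array.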

strange index issue for a simple vector

I have an array like {7, 3, 1, 5, 2}, for example. For each index, I want the maximum value from that index onwards, so the result should be {7, 5, 5, 5, 2}. I just use a for loop like the one below. It fails with a strange heap overflow error.
int maxProfit(vector<int>& prices) {
vector<int> maxRight;
int runningMax = 0;
for(auto i=(prices.size()-1);i>=0;i--){
if(runningMax < prices[i]){
runningMax = prices[i];
}
maxRight.push_back(runningMax);
}
maxRight.push_back(runningMax);
std::reverse(maxRight.begin(), maxRight.end());
......
But if I change it to the code below, it works. Isn't the code below the same as the code above? I just changed the comparison of the index i from 0 to 1.
int maxProfit(vector<int>& prices) {
vector<int> maxRight;
int runningMax = 0;
for(auto i=(prices.size()-1);i>=1;i--){
if(runningMax < prices[i]){
runningMax = prices[i];
}
maxRight.push_back(runningMax);
}
if(runningMax < prices[0]){
runningMax=prices[0];
}
std::reverse(maxRight.begin(), maxRight.end());
......

As was pointed out in a comment,
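in auto i=(prices.size()-1) the variable i is deduced to have an unsigned type, so the condition i >= 0 is always true. When i-- is applied to 0, the value wraps around to a very large number, and prices[i] then accesses the vector out of bounds.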
Instead of using an index, use an iterator, in this case a std::vector::reverse_iterator.
for(auto it = prices.rbegin(); it != prices.rend(); ++it)
{
if(runningMax < *it)
{
runningMax = *it;
}
maxRight.push_back(runningMax);
}
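If an index is still preferred over an iterator, a common idiom is to test and decrement in the loop condition. The sketch below assumes the goal stated in the question (the maximum from each index onwards); the function name max_to_the_right is invented for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: result[i] is the maximum of prices[i], ..., prices.back().
std::vector<int> max_to_the_right(const std::vector<int>& prices)
{
    std::vector<int> maxRight(prices.size());
    if (prices.empty()) return maxRight;  // avoids prices.back() on an empty vector
    int runningMax = prices.back();
    // "i-- > 0" compares first and decrements afterwards, so the body sees
    // size() - 1 down to 0 and the loop stops without relying on i < 0.
    for (std::size_t i = prices.size(); i-- > 0; )
    {
        if (runningMax < prices[i]) runningMax = prices[i];
        maxRight[i] = runningMax;
    }
    return maxRight;
}
```

Writing into maxRight[i] directly also removes the need for the final std::reverse.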

As the variable i declared in the for loop
for(auto i=(prices.size()-1);i>=0;i--){
has the unsigned integer type std::vector<int>::size_type, you will get an infinite loop: when i is equal to 0, the expression i-- again produces a non-negative number.
Another problem is that the for loop will also invoke undefined behavior if the passed vector is empty, due to the initialization in the declaration part of the for loop
auto i=(prices.size()-1)
because prices.size()-1 wraps around to the largest value of the unsigned type in this case.
In the second function implementation you forgot to push a calculated value for the first element of the vector prices. You just wrote after the loop
if(runningMax < prices[0]){
runningMax=prices[0];
}
which does not make much sense.
You could write a separate function that returns the desired vector of maximum prices.
Here is a demonstrative program.
#include <iostream>
#include <vector>
#include <iterator>
#include <algorithm>
std::vector<int> max_prices( const std::vector<int> &prices )
{
std::vector<int> maxRight;
maxRight.reserve( prices.size() );
for ( auto first = std::rbegin( prices), last = std::rend( prices );
first != last;
++first )
{
if ( maxRight.empty() )
{
maxRight.push_back( *first );
}
else
{
maxRight.push_back( std::max( maxRight.back(), *first ) );
}
}
std::reverse( std::begin( maxRight ), std::end( maxRight ) );
return maxRight;
}
int main()
{
std::vector<int> prices = { 7, 3, 1, 5, 2 };
auto maxRight = max_prices( prices );
for ( const auto &price : maxRight )
{
std::cout << price << ' ';
}
std::cout << '\n';
return 0;
}
The program output is
7 5 5 5 2

Vector - value with k-occurences first

This is my first post and hope I'm not doing anything wrong.
I am trying to write a program that finds the first value in a vector that reaches k occurrences.
For example, given this vector and k=3:
1 1 2 3 4 4 2 2 1 3
I would expect 2 as output, because 2 is the first number to reach its 3rd occurrence.
The following code is what I tried, but somehow the output is not correct.
#include<iostream>
#include<vector>
using namespace std;
int main()
{
vector<int> vettore;
int k;
int a,b,i;
int occ_a;
int occ_b;
cout<< "Write values of vector (number 0 ends the input of values)\n";
int ins;
cin>>ins;
while(ins)
{
vettore.push_back(ins); //Elements insertion
cin>>ins;
}
cout<<"how many occurrences?\n"<<endl;;
cin>>k;
if(k>0)
{
int i=0;
b = vettore[0];
occ_b=0;
while(i< vettore.size())
{
int j=i;
occ_a = 0;
a = vettore[i];
while(occ_a < k && j<vettore.size())
{
if(vettore[j]== a)
{
occ_a++;
vettore.erase(vettore.begin() + j);
}
else
j++;
}
if(b!=a && occ_b < occ_a)
b = a;
i++;
}
cout << b; //b is the value that reached k-occurrences first
}
return 0;
}
Hours have passed but I have not been able to solve it.
Thank you for your help!

Your code is difficult to read because you declare variables far from where they are used, so their meaning is difficult to understand.
Also, there is no need to remove elements from the vector. Finding the first value that occurs k times is not the same task as modifying the vector.
I can suggest the following solution, shown in the demonstration program below.
#include <iostream>
#include <vector>
int main()
{
std::vector<int> v = { 1, 1, 2, 3, 4, 4, 2, 2, 1, 3 };
size_t least_last = v.size();
size_t k = 3;
for ( size_t i = 0; i + k <= least_last; i++ )
{
size_t count = 1;
size_t j = i;
while ( count < k && ++j < least_last )
{
if ( v[j] == v[i] ) ++count;
}
if ( count == k )
{
least_last = j;
}
}
if ( least_last != v.size() ) std::cout << v[least_last] << '\n';
return 0;
}
The program output is
2
The idea is to find the position of the k-th occurrence of the first element that occurs k times. As soon as it is found, the upper limit of the traversed sequence is set to this position. If another element reaches k occurrences before this limit, it reached its k-th occurrence earlier than the element found so far, and the limit is lowered again.
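A different technique, not the one used in the program above, is to count occurrences in a single pass with a hash map and stop at the first value whose count reaches k. The following is a sketch under that assumption; the function name first_kth_occurrence is invented here.

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical helper: returns the index at which some value reaches its
// k-th occurrence first, or v.size() if no value occurs k times.
std::size_t first_kth_occurrence(const std::vector<int>& v, std::size_t k)
{
    if (k == 0) return v.size();  // zero occurrences is treated as "not found"
    std::unordered_map<int, std::size_t> counts;
    for (std::size_t i = 0; i < v.size(); ++i)
    {
        // operator[] value-initializes a missing count to 0 before the increment.
        if (++counts[v[i]] == k) return i;
    }
    return v.size();
}
```

The caller compares the result with v.size() before reading v[result]. This trades extra memory for a single pass over the vector.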

How do I find a particular value in an array and return its index?

Pseudo Code:
int arr[ 5 ] = { 4, 1, 3, 2, 6 }, x;
x = find(3).arr ;
x would then return 2.

The syntax you have there for your function doesn't make sense (why would the return value have a member called arr?).
To find the index, use std::find from the <algorithm> header and std::distance from the <iterator> header.
int x = std::distance(arr, std::find(arr, arr + 5, 3));
Or you can make it into a more generic function:
template <typename Iter>
size_t index_of(Iter first, Iter last, const typename std::iterator_traits<Iter>::value_type& x)
{
size_t i = 0;
while (first != last && *first != x)
++first, ++i;
return i;
}
Here, I'm returning the length of the sequence if the value is not found (which is consistent with the way the STL algorithms return the last iterator). Depending on your taste, you may wish to use some other form of failure reporting.
In your case, you would use it like so:
size_t x = index_of(arr, arr + 5, 3);

Here is a very simple way to do it by hand. You could also use the <algorithm>, as Peter suggests.
#include <iostream>
int find(int arr[], int len, int seek)
{
for (int i = 0; i < len; ++i)
{
if (arr[i] == seek) return i;
}
return -1;
}
int main()
{
int arr[ 5 ] = { 4, 1, 3, 2, 6 };
int x = find(arr,5,3);
std::cout << x << std::endl;
}

The fancy answer:
Use std::vector and search with std::find
The simple answer
Use a for loop

If the array is unsorted, you will need to use linear search.
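Conversely, if the array is already sorted, a binary search can replace the linear scan. The sketch below assumes ascending order, which is not the case for the array in the question; the function name index_in_sorted is made up for illustration.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper: returns the index of key in the sorted range
// arr[0..n), or -1 if key is not present.
std::ptrdiff_t index_in_sorted(const int* arr, std::size_t n, int key)
{
    // std::lower_bound returns the first position whose element is not less than key.
    const int* pos = std::lower_bound(arr, arr + n, key);
    if (pos != arr + n && *pos == key)
    {
        return pos - arr;
    }
    return -1;
}
```

Sorting an unsorted array first would change the indices, so this only helps when the data is kept sorted anyway.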

#include <vector>
#include <algorithm>
int main()
{
int arr[5] = {4, 1, 3, 2, 6};
int x = -1;
std::vector<int> testVector(arr, arr + sizeof(arr) / sizeof(int) );
std::vector<int>::iterator it = std::find(testVector.begin(), testVector.end(), 3);
if (it != testVector.end())
{
x = it - testVector.begin();
}
return 0;
}
Or you can just build a vector in a normal way, without creating it from an array of ints and then use the same solution as shown in my example.

int arr[5] = {4, 1, 3, 2, 6};
vector<int> vec;
int i =0;
int no_to_be_found;
cin >> no_to_be_found;
while(i != 5) // copy all five elements of arr
{
vec.push_back(arr[i]);
i++;
}
cout << find(vec.begin(),vec.end(),no_to_be_found) - vec.begin();

Here we simply use linear search. First initialize the index to -1. Then search the array; if the key is found, store its position in the index variable and break. Otherwise, the index stays -1.
#include <iostream>
using namespace std;
int find(int arr[], int n, int key)
{
int index = -1;
for(int i=0; i<n; i++)
{
if(arr[i]==key)
{
index=i;
break;
}
}
return index;
}
int main()
{
int arr[ 5 ] = { 4, 1, 3, 2, 6 };
int n = sizeof(arr)/sizeof(arr[0]);
int x = find(arr ,n, 3);
cout<<x<<endl;
return 0;
}

You could use the find function provided by the standard <algorithm> library
#include <iostream>
#include <algorithm>
using std::cout;
using std::endl;
using std::find;
int main() {
const int length = 10; // must be a constant: a variable length array cannot be initialized
int arr[length] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
int* found_pos = find(arr, arr + length, 5);
if(found_pos != (arr + length)) {
// found
cout << "Found: " << *found_pos << endl;
}
else {
// not found
cout << "Not Found." << endl;
}
return 0;
}

The std::find function from <algorithm> finds an element in an array and returns an iterator (for a plain array, a pointer) to that element. If the element is not found, the returned iterator points to the end of the array.
In case the element is found, we can simply calculate the distance of the iterator from the beginning of the array to get the index of that element.
#include <algorithm>
#include <iostream>
#include <iterator>
using namespace std;
int main()
{
int arr[ 5 ] = { 4, 1, 3, 2, 6 };
auto it = find(begin(arr), end(arr), 3); // free function, not a member of the array
if(it != end(arr))
cerr << "Found at index: " << (it-begin(arr)) << endl;
else
cerr << "Not found\n";
}

How to get distinct values from an arrays of different sizes?

Q:
arr1[]={1,1,1,2,5,5,6,6,6,6,8,7,9}
Ans:
values[]={1,2,5,6,7,9}
Q:
arr1[]={1,1,1,2,5,5,6,6,6,6,8,7,9,101,1502,1502,1,9}
Ans:
values[]={1,2,5,6,7,9,101,1502}
Here is what I tried, but it is not working:
for(int i=0;i<(index-1);i++) {
if(data[i].age != data[i+1].age) {
c=new list;
c->value=data[i].age;
c->next=NULL; clas++;
if(age_head==NULL) {
p=c; age_head=c;
}
for(c=age_head;c!=NULL,c->next!=NULL;p=c,c=c->next) {
if(data[i].age!=c->value)
found=false;
else
found=true;
}
if((age_head!=NULL)&& (found=false)) {
p->next=c; c->next=NULL;
}
}
}

This is not the most efficient, but it has some values:
It uses STL objects
It uses a cool little known template trick for knowing at compile time the size of your C-like arrays
...
int a[] = {1,1,1,2,5,5,6,6,6,6,8,7,9} ;
int b[] = {1,1,1,2,5,5,6,6,6,6,8,7,9,101,1502,1502,1,9} ;
// function setting the set values
template<size_t size>
void findDistinctValues(std::set<int> & p_values, int (&p_array)[size])
{
// Code modified after Jacob's excellent comment
p_values.clear() ;
p_values.insert(p_array, p_array + size) ;
}
void foo()
{
std::set<int> values ;
findDistinctValues(values, a) ;
// values now contain {1, 2, 5, 6, 7, 8, 9}
findDistinctValues(values, b) ;
// values now contain {1, 2, 5, 6, 7, 8, 9, 101, 1502}
}
Another version could return the set, instead of taking it by reference. It would then be:
int a[] = {1,1,1,2,5,5,6,6,6,6,8,7,9} ;
int b[] = {1,1,1,2,5,5,6,6,6,6,8,7,9,101,1502,1502,1,9} ;
// function returning the set
template<size_t size>
std::set<int> findDistinctValues(int (&p_array)[size])
{
// Code modified after Jacob's excellent comment
return std::set<int>(p_array, p_array + size) ;
}
void foo()
{
std::set<int> valuesOne = findDistinctValues(a) ;
// valuesOne now contain {1, 2, 5, 6, 7, 8, 9}
std::set<int> valuesTwo = findDistinctValues(b) ;
// valuesTwo now contain {1, 2, 5, 6, 7, 8, 9, 101, 1502}
}

The first thing I spot in your code is
if((age_head!=NULL)&& (found=false)) {
you use assignment (=) instead of equality (==). The expression should be
if((age_head!=NULL)&& (found==false)) {
Then, in this loop
for(c=age_head;c!=NULL,c->next!=NULL;p=c,c=c->next) {
you are looking for a value in the list. However, in its current form, when the loop terminates, found only reflects the comparison made in the last iteration, not whether any earlier element matched. You need to check found in the loop condition (and you need to AND the expressions instead of separating them with a comma!):
for(c=age_head, found = false; !found && c!=NULL && c->next!=NULL; ...) {
The result of the comma operator is the result of the last subexpression inside - this is definitely not what you want. Moreover, with comma all subexpressions are evaluated, which results in dereferencing a null pointer if c == NULL - whereas the && operator is evaluated lazily, thus c->next!=NULL is evaluated only if c != NULL.
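The difference between the two operators can be seen in isolation. In this small sketch, Node is a hypothetical stand-in for the list type of the question, and both function names are invented for illustration.

```cpp
// Hypothetical stand-in for the question's list node type.
struct Node
{
    int value;
    Node* next;
};

// Safe: && short-circuits, so c->next is only read when c is not null.
bool has_successor(const Node* c)
{
    return c != nullptr && c->next != nullptr;
}

// The comma operator evaluates both operands and yields only the right one;
// the cast to void merely silences the "unused value" warning for the left one.
bool comma_result(bool left, bool right)
{
    return (static_cast<void>(left), right);
}
```

Calling has_successor with a null pointer returns false without dereferencing anything, whereas a condition written as c!=NULL, c->next!=NULL would read c->next regardless.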
The next thing is that you need to search for the value in the list before you add it to the list! Also note that you are trying to check for two different things: that the actual data element is different from the next one, and that its value is not yet added to the list. The second condition is stronger - it will always work, while the first only works if the input data is ordered. So you can omit the first check altogether. The result of all the above, plus some more simplifications and clarifications, is
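for(int i=0;i<index;i++) {
bool found=false; // declared on its own: inside "list* c=..., found=false" it would be a list object
for(list* c=age_head; !found&&c; p=c,c=c->next) { // visits every node, so p ends on the last one
if(data[i].age==c->value)
found=true;
}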
if(!found) {
list* newc=new list;
newc->value=data[i].age;
newc->next=NULL;
clas++;
if(age_head==NULL) {
p=newc; age_head=newc;
} else {
p->next=newc; newc->next=NULL;
}
}
}
I still don't guarantee that your linked list handling logic is right though :-) In its current form, your code is hard to understand, because the different logical steps are not separated. With a bit of refactoring, the code could look a lot clearer, e.g.
for(int i=0;i<index;i++) {
if(!foundInList(data[i].age)) {
addToList(data[i].age);
}
}
Of course the simplest and most efficient would be using STL containers/algorithms instead, as shown in other answers. But I think there is much more educational value in improving your first attempt :-)
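One possible shape for the two helpers named above is sketched here. Everything in it is an assumption made for illustration: the Node type stands in for the question's list, the helpers take the list as parameters instead of relying on globals, and plain ints replace data[i].age.

```cpp
#include <cstddef>

// Hypothetical stand-in for the question's list node type.
struct Node
{
    int value;
    Node* next;
};

// Returns true if value is stored in any node of the list.
bool foundInList(const Node* head, int value)
{
    for (const Node* c = head; c != nullptr; c = c->next)
    {
        if (c->value == value) return true;
    }
    return false;
}

// Appends a new node; tail is kept so that appending needs no traversal.
void addToList(Node*& head, Node*& tail, int value)
{
    Node* n = new Node{value, nullptr};
    if (head == nullptr) head = n;
    else tail->next = n;
    tail = n;
}

// Builds a list of the distinct values in first-seen order.
// The caller owns the nodes and must delete them.
Node* distinctValues(const int* data, std::size_t count)
{
    Node* head = nullptr;
    Node* tail = nullptr;
    for (std::size_t i = 0; i < count; ++i)
    {
        if (!foundInList(head, data[i])) addToList(head, tail, data[i]);
    }
    return head;
}
```

Because every insertion scans the list, this is quadratic in the number of distinct values, which is fine for small inputs like the ones in the question.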

If the output does not need to be sorted, you can use a hash table.
E.g. something like this:
#include <boost/foreach.hpp>
#define foreach BOOST_FOREACH
#include <boost/unordered_set.hpp>
#include <vector>
using namespace std;
using namespace boost;
int main() {
int arr1[]={1,1,1,2,5,5,6,6,6,6,8,7,9};
size_t n = sizeof(arr1)/sizeof(int);
unordered_set<int> h;
for (size_t i = 0; i < n; ++i)
h.insert(arr1[i]);
vector<int> values;
foreach(int a, h)
values.push_back(a);
return 0;
}
The expected runtime is then O(n).
An alternative is to sort the array and then eliminate neighboring identical elements (advantage: only the STL is needed). But then the runtime is O(n log n):
#include <vector>
#include <algorithm>
using namespace std;
int main() {
int arr1[]={1,1,1,2,5,5,6,6,6,6,8,7,9};
size_t n = sizeof(arr1)/sizeof(int);
sort(arr1, arr1+n);
int *end = unique(arr1, arr1+n);
vector<int> values(arr1, end);
return 0;
}
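Since C++11 the hash-table variant no longer needs Boost, because std::unordered_set is part of the standard library. The sketch below additionally keeps the values in first-seen order, which is an extra assumption beyond what the answer above does; the function name distinct_in_order is invented.

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical helper: returns the distinct values of input in the order
// in which they first appear.
std::vector<int> distinct_in_order(const std::vector<int>& input)
{
    std::unordered_set<int> seen;
    std::vector<int> values;
    for (int x : input)
    {
        // insert() returns a pair whose .second is true only for a new element.
        if (seen.insert(x).second) values.push_back(x);
    }
    return values;
}
```

The expected running time stays linear, at the cost of storing each distinct value twice.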

Easily done using STL.
int array[] = { 1, 1, 2, 2, 1, 3, 3, 4, 5, 4, 4, 1, 1, 2 };
int nElements = sizeof(array)/sizeof(array[0]);
std::sort(&array[0], &array[nElements]);
int newSize = std::unique(&array[0], &array[nElements]) - &array[0];

First you need to sort the array and then do something like this:
for(int i = 0; i < size; i++)
{
// keep array[i] if it is the last element or differs from its successor
if(i == size - 1 || array[i] != array[i+1])
{
unique++;
// store it wherever you want to.
stored.push(array[i]);
}
}

#include <vector>
#include <algorithm>
#include <iostream>
int
main ()
{
int array[] = { 1, 1, 2, 2, 1, 3, 3, 4, 5, 4, 4, 1, 1, 2 };
std::vector < int >values;
values.push_back (array[0]);
for (int i = 1; i < sizeof (array) / sizeof (int); ++i)
{
std::vector < int >::iterator it =
std::find (values.begin (), values.end (), array[i]);
if (it == values.end ())
values.push_back (array[i]);
}
std::cout << "Result:" << std::endl;
for (int i = 0; i < values.size (); i++)
std::cout << values[i] << std::endl;
}

This seems to be a duplicate of Removing duplicates in an array while preserving the order in C++
While the wording of the question is different, the result is the same.

Based on the ideas and code above, I was able to find the distinct values in a C++ array. Thanks to everyone who replied in this thread.
#include <cstdio>
#include <set>
#include <iostream>
using namespace std;
// function setting the set values
template<size_t size>
void findDistinctValues(std::set<int> & p_values,int (&p_array)[size])
{
// Code modified after Jacob's excellent comment
p_values.clear() ;
p_values.insert(p_array, p_array + size) ;
}
void findDistinctValues2( int arr[],int size)
{
std::set<int> values_1 ;
std::set<int>::iterator it_1;
values_1.clear();
values_1.insert(arr,arr+size);
for (it_1=values_1.begin(); it_1!=values_1.end(); ++it_1)
std::cout << ' ' << *it_1<<endl;
}
int main()
{
int arr[] = {1,6100,4,94,93,-6,2,4,4,5,5,2500,5,4,5,2,3,6,1,15,16,0,0,99,0,0,34,99,6100,2500};
std::set<int> values ;
std::set<int>::iterator it;
int arr_size = sizeof(arr)/sizeof(int);
printf("Total no of array variables: %d\n",arr_size);
printf("Output from findDistinctValues (function 1)\n ");
findDistinctValues(values, arr) ;
for (it=values.begin(); it!=values.end(); ++it)
std::cout << ' ' << *it<<endl;
std::cout<<endl;
std::cout<<values.size()<<endl; //find the number of distinct values
printf("Output from findDistinctValues (function 2) \n ");
findDistinctValues2(arr,arr_size);
getchar();
return 0;
}
