Identifying duplicate values in linked list C++

I want to identify which values are duplicated, and how many times each one occurs, in a linked list built from user input. This is the code I wrote for it:
int count;
int compare, compare2;
for (p = first; p != NULL; p = p->next){
compare = p->num;
for (j = first; j != NULL; j = j->next){
if (compare == j->num){
compare2 = j->num;
count++;
}
}
if (count > 1){
cout << "There are at least 2 identical values of: " << compare2 << " that repeat for: " << count << "times" << endl;
}
}
Basically the idea of it was that I take the first element in the first loop and compare it to all the elements in the second loop and count if there are cases of them being similar, and print the result after - then I take next element and so on.
However the output is all the elements and it doesn't count correctly either. I'm just lost at how to adjust it.
I tried using the same p variable in both loops as it is the same list I want to loop, but then the .exe failed as soon as I'd finished input.
I saw a few examples around where there was a function for deleting duplicate values, but the comparison part ran through a while loop, and I'm just wondering - what am I doing wrong on this one?!

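Your O(N*N) approach:
// Pick an element
for (p = first; p != NULL && p->next != NULL; p = p->next)
{
    count = 1; // reset for each p; p itself is one occurrence
    // Compare it with the remaining elements
    for (j = p->next; j != NULL; j = j->next)
    {
        if (p->num == j->num)
        {
            count++;
        }
    }
    // Note: a value with three or more occurrences is reported again from its later nodes
    if (count > 1)
    {
        std::cout << p->num << " occurs " << count << " times" << '\n';
    }
}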
It's better to use a hash map (std::unordered_map) to solve this in O(N) average time with O(N) extra space:
std::unordered_map<int, int> m ;
for( p = first; p != NULL ; p = p->next )
{
m[ p->num ]++;
}
for (const auto &pair : m )
{
if( pair.second > 1 )
std::cout << pair.first << ": " << pair.second << '\n';
}
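For reference, the hash map idea can be wrapped into a self-contained function. This is only a sketch: the Node struct and the name findDuplicates are assumptions for illustration, since the question shows only the members num and next. The result is sorted by value because std::unordered_map has no defined iteration order.

```cpp
#include <algorithm>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical node type; the question only shows the members num and next.
struct Node {
    int num;
    Node* next;
};

// Returns (value, occurrences) for every value that appears more than once,
// sorted by value so that the result is deterministic.
std::vector<std::pair<int, int>> findDuplicates(const Node* first) {
    std::unordered_map<int, int> counts;
    for (const Node* p = first; p != nullptr; p = p->next) {
        ++counts[p->num]; // operator[] value-initializes a missing count to 0
    }
    std::vector<std::pair<int, int>> result;
    for (const auto& entry : counts) {
        if (entry.second > 1) {
            result.emplace_back(entry.first, entry.second);
        }
    }
    std::sort(result.begin(), result.end());
    return result;
}
```

For a list holding 5, 5, 3, 5, 3 this returns the pairs (3, 2) and (5, 3).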

Your logic is flawed since both p and j iterate over the entire list. When p == j, the values are bound to match.
Change the block
if (compare == j->num){
compare2 = j->num;
count++;
}
to
if (p != j && compare == j->num){
compare2 = j->num;
count++;
}
Also, you don't need the line
compare2 = j->num;
since compare2 will be equal to compare.
You can reduce the number of tests by changing the inner for loop a bit. Then, you won't need the p != j bit either.
for (j = p->next; j != NULL; j = j->next){
if (compare == j->num){
count++;
}
}

The problem is that you don't exclude the element you are comparing against (compare). So for every element, the inner loop finds at least one "duplicate": the element itself!
Try comparing the current element (p) only with the elements that follow it in the inner loop.

Related

Quickselect algorithm for singly linked list C++

I need an algorithm which can find the median of a singly linked list in linear time complexity O(n) and constant space complexity O(1).
EDIT: The singly linked list is a C-style singly linked list. No stl allowed (no container, no functions, everything stl is forbidden, e.g no std::forward_list). Not allowed to move the numbers in any other container (like an array).
It's acceptable to have a space complexity of O(logn) as this will be actually even under 100 for my lists. Also I am not allowed to use the STL functions like the nth_element
Basically I have a linked list with about 3 * 10^6 elements and I need to get the median in 3 seconds, so I can't use a sorting algorithm to sort the list (that would be O(nlogn) and would take something like 10-14 seconds).
I've done some searching online and found that it's possible to find the median of an std::vector in O(n) time and O(1) space complexity with quickselect (the worst case is O(n^2), but it is rare), example: https://www.geeksforgeeks.org/quickselect-a-simple-iterative-implementation/
But I can't find any algorithm that does this for a linked list. The issue is that with a vector I can use the array index for random access. If I adapt that algorithm to a list, the complexity will be much bigger. For example, when I move the pivot index to the left, I actually need to traverse the list to get to that new element and go further (this gives me at least O(kn) with a big k for my list, even approaching O(n^2)...).
EDIT 2:
I know I have too many variables but I've been testing different stuff and I am still working on my code...
My current code:
#include <bits/stdc++.h>
using namespace std;
template <class T> class Node {
public:
T data;
Node<T> *next;
};
template <class T> class List {
public:
Node<T> *first;
};
template <class T> T getMedianValue(List<T> & l) {
Node<T> *crt,*pivot,*incpivot;
int left, right, lung, idx, lungrel,lungrel2, left2, right2, aux, offset;
pivot = l.first;
crt = pivot->next;
lung = 1;
//lung is the lenght of the linked list (yeah it's lenght in romanian...)
//lungrel and lungrel2 are the relative lenghts of the part of
//the list I am processing, e.g: 2 3 4 in a list with 1 2 3 4 5
right = left = 0;
while (crt != NULL) {
if(crt->data < pivot->data){
aux = pivot->data;
pivot->data = crt->data;
crt->data = pivot->next->data;
pivot->next->data = aux;
pivot = pivot->next;
left++;
}
else right++;
// cout<<crt->data<<endl;
crt = crt->next;
lung++;
}
if(right > left) offset = left;
// cout<<endl;
// cout<<pivot->data<<" "<<left<<" "<<right<<endl;
// printList(l);
// cout<<endl;
lungrel = lung;
incpivot = l.first;
// offset = 0;
while(left != right){
//cout<<"parcurgere"<<endl;
if(left > right){
//cout<<endl;
//printList(l);
//cout<<endl;
//cout<<"testleft "<<incpivot->data<<" "<<left<<" "<<right<<endl;
crt = incpivot->next;
pivot = incpivot;
idx = offset;left2 = right2 = lungrel = 0;
//cout<<idx<<endl;
while(idx < left && crt!=NULL){
if(pivot->data > crt->data){
// cout<<"1crt "<<crt->data<<endl;
aux = pivot->data;
pivot->data = crt->data;
crt->data = pivot->next->data;
pivot->next->data = aux;
pivot = pivot->next;
left2++;lungrel++;
}
else {
right2++;lungrel++;
//cout<<crt->data<<" "<<right2<<endl;
}
//cout<<crt->data<<endl;
crt = crt->next;
idx++;
}
left = left2 + offset;
right = lung - left - 1;
if(right > left) offset = left;
//if(pivot->data == 18) return 18;
//cout<<endl;
//cout<<"l "<<pivot->data<<" "<<left<<" "<<right<<" "<<right2<<endl;
// printList(l);
}
else if(left < right && pivot->next!=NULL){
idx = left;left2 = right2 = 0;
incpivot = pivot->next;offset++;left++;
//cout<<endl;
//printList(l);
//cout<<endl;
//cout<<"testright "<<incpivot->data<<" "<<left<<" "<<right<<endl;
pivot = pivot->next;
crt = pivot->next;
lungrel2 = lungrel;
lungrel = 0;
// cout<<"p right"<<pivot->data<<" "<<left<<" "<<right<<endl;
while((idx < lungrel2 + offset - 1) && crt!=NULL){
if(crt->data < pivot->data){
// cout<<"crt "<<crt->data<<endl;
aux = pivot->data;
pivot->data = crt->data;
crt->data = (pivot->next)->data;
(pivot->next)->data = aux;
pivot = pivot->next;
// cout<<"crt2 "<<crt->data<<endl;
left2++;lungrel++;
}
else right2++;lungrel++;
//cout<<crt->data<<endl;
crt = crt->next;
idx++;
}
left = left2 + left;
right = lung - left - 1;
if(right > left) offset = left;
// cout<<"r "<<pivot->data<<" "<<left<<" "<<right<<endl;
// printList(l);
}
else{
//cout<<cmx<<endl;
return pivot->data;
}
}
//cout<<cmx<<endl;
return pivot->data;
}
template <class T> void printList(List<T> const & l) {
Node<T> *tmp;
if(l.first != NULL){
tmp = l.first;
while(tmp != NULL){
cout<<tmp->data<<" ";
tmp = tmp->next;
}
}
}
template <class T> void push_front(List<T> & l, int x)
{
Node<T>* tmp = new Node<T>;
tmp->data = x;
tmp->next = l.first;
l.first = tmp;
}
int main(){
List<int> l;
int n = 0;
push_front(l, 19);
push_front(l, 12);
push_front(l, 11);
push_front(l, 101);
push_front(l, 91);
push_front(l, 21);
push_front(l, 9);
push_front(l, 6);
push_front(l, 25);
push_front(l, 4);
push_front(l, 18);
push_front(l, 2);
push_front(l, 8);
push_front(l, 10);
push_front(l, 200);
push_front(l, 225);
push_front(l, 170);
printList(l);
n=getMedianValue(l);
cout<<endl;
cout<<n;
return 0;
}
Do you have any suggestion on how to adapt quickselect to a singly linked list, or another algorithm that would work for my problem?
In your question, you mentioned that you are having trouble selecting a pivot that is not at the start of the list, because this would require traversing the list. If you do it correctly, you only have to traverse the entire list twice:
once for finding the middle and the end of the list in order to select a good pivot (e.g. using the "median-of-three" rule)
once for the actual sorting
The first step is not necessary if you don't care much about selecting a good pivot and you are happy with simply selecting the first element of the list as the pivot (which causes worst case O(n^2) time complexity if the data is already sorted).
If you remember the end of the list the first time you traverse it by maintaining a pointer to the end, then you should never have to traverse it again to find the end. Also, if you are using the standard Lomuto partition scheme (which I am not using for the reasons stated below), then you must also maintain two pointers into the list which represent the i and j index of the standard Lomuto partition scheme. By using these pointers, you should never have to traverse the list just to access a single element.
Also, if you maintain a pointer to the middle and the end of every partition, then, when you later must sort one of these partitions, you will not have to traverse that partition again to find the middle and end.
I have now created my own implementation of the QuickSelect algorithm for linked lists, which I have posted below.
Since you stated that the linked list is singly-linked and cannot be upgraded to a doubly-linked list, I can't use the Hoare partition scheme, as iterating a singly-linked list backwards is very expensive. Therefore, I am using the generally less efficient Lomuto partition scheme instead.
When using the Lomuto partition scheme, either the first element or the last element is typically selected as a pivot. However, selecting either of those has the disadvantage that sorted data will cause the algorithm to have the worst-case time complexity of O(n^2). This can be prevented by selecting a pivot according to the "median-of-three" rule, which is to select a pivot from the median value of the first element, middle element and last element. Therefore, in my implementation, I am using this "median-of-three" rule.
Also, the Lomuto partition scheme typically creates two partitions, one for values smaller than the pivot and one for values larger than or equal to the pivot. However, this will cause the worst-case time complexity of O(n^2) if all values are identical. Therefore, in my implementation, I am creating three partitions, one for values smaller than the pivot, one for values larger than the pivot, and one for values equal to the pivot.
Although these measures don't completely eliminate the possibility of worst-case time complexity of O(n^2), they at least make it highly unlikely (unless the input is provided by a malicious attacker). In order to guarantee a time complexity of O(n), a more complex pivot selection algorithm would have to be used, such as median of medians.
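As a small illustration of the pivot selection described above, the following sketch walks a list once to locate the middle and last nodes and then applies the "median-of-three" rule. The names medianOfThree and choosePivot are hypothetical and are not taken from the implementation below; the Node layout matches the one used in the question.

```cpp
#include <cassert>

template <typename T>
struct Node {
    T data;
    Node<T>* next;
};

// Returns the median of three values using only operator<.
template <typename T>
T medianOfThree(const T& a, const T& b, const T& c) {
    if (a < b) {
        if (b < c) return b;      // a < b < c
        return (a < c) ? c : a;   // c <= b and a < b
    }
    if (a < c) return a;          // b <= a < c
    return (b < c) ? c : b;       // c <= a and b <= a
}

// Walks the list once, tracking the middle and last nodes, then picks the pivot value.
template <typename T>
T choosePivot(const Node<T>* first) {
    assert(first != nullptr); // the pivot is undefined for an empty list
    const Node<T>* middle = first;
    const Node<T>* last = first;
    int count = 1;
    while (last->next != nullptr) {
        last = last->next;
        ++count;
        if (count % 2 == 0) middle = middle->next; // middle advances at half speed
    }
    return medianOfThree(first->data, middle->data, last->data);
}
```

For the list 9, 1, 5, 7, 3 the three candidates are 9, 5 and 3, so the chosen pivot is 5. On already sorted input the middle element is picked, which is what avoids the O(n^2) behavior of a first-element pivot.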
One significant problem I encountered is that for an even number of elements, the median is defined as the arithmetic mean of the two "middle" or "median" elements. For this reason, I can't simply write a function similar to std::nth_element, because if, for example, the total number of elements is 14, then I will be looking for the 7th and 8th largest element. This means I would have to call such a function twice, which would be inefficient. Therefore, I have instead written a function which can search for the two "median" elements at once. Although this makes the code more complex, the performance penalty due to the additional code complexity should be minimal compared to the advantage of not having to call the same function twice.
Please note that although my implementation compiles perfectly on a C++ compiler, I wouldn't call it textbook C++ code, because the question states that I am not allowed to use anything from the C++ standard template library. Therefore, my code is rather a hybrid of C code and C++ code.
In the following code, I only use the standard template library (in particular the function std::nth_element) for testing my algorithm and for verifying the results. I do not use any of these functions in my actual algorithm.
#include <iostream>
#include <iomanip>
#include <cassert>
// The following headers are only required for testing the algorithm and verifying
// the correctness of its results. They are not used in the algorithm itself.
#include <random>
#include <algorithm>
#include <memory> // std::make_unique
// The following setting can be changed to print extra debugging information
// possible settings:
// 0: no extra debugging information
// 1: print the state and length of all partitions in every loop iteration
// 2: additionally print the contents of all partitions (if they are not too big)
#define PRINT_DEBUG_LEVEL 0
template <typename T>
struct Node
{
T data;
Node<T> *next;
};
// NOTE:
// The return type is not necessarily the same as the data type. The reason for this is
// that, for example, the data type "int" requires a "double" as a return type, so that
// the arithmetic mean of "3" and "6" returns "4.5".
// This function may require template specializations to handle overflow or wrapping.
template<typename T, typename U>
U arithmetic_mean( const T &first, const T &second )
{
return ( static_cast<U>(first) + static_cast<U>(second) ) / 2;
}
//the main loop of the function find_median can be in one of the following three states
enum LoopState
{
//we are looking for one median value
LOOPSTATE_LOOKINGFORONE,
//we are looking for two median values, and the returned median
//will be the arithmetic mean of the two
LOOPSTATE_LOOKINGFORTWO,
//one of the median values has been found, but we are still searching for
//the second one
LOOPSTATE_FOUNDONE
};
template <
typename T, //type of the data
typename U //type of the return value
>
U find_median( Node<T> *list )
{
//This variable points to the pointer to the first element of the current partition.
//During the partition phase, the linked list will be broken and reassembled afterwards, so
//the pointer this pointer points to will be nullptr until it is reassembled.
Node<T> **pp_start = &list;
//This pointer represents nothing more than the cached value of *pp_start and it is
//not always valid
Node<T> *p_start = *pp_start;
//These pointers are maintained for accessing the middle of the list for selecting a pivot
//using the "median-of-three" rule.
Node<T> *p_middle;
Node<T> *p_end;
//result is not defined if list is empty
assert( p_start != nullptr );
//in the main loop, this variable always holds the number of elements in the current partition
int num_total = 1;
// First, we must traverse the entire linked list in order to determine the number of elements,
// in order to calculate k1 and k2. If it is odd, then the median is defined as the k'th smallest
// element where k = n / 2. If the number of elements is even, then the median is defined as the
// arithmetic mean of the k'th element and the (k+1)'th element.
// We also set a pointer to the nodes in the middle and at the end, which will be required later
// for selecting a pivot according to the "median-of-three" rule.
p_middle = p_start;
for ( p_end = p_start; p_end->next != nullptr; p_end = p_end->next )
{
num_total++;
if ( num_total % 2 == 0 ) p_middle = p_middle->next;
}
// find out whether we are looking for only one or two median values
enum LoopState loop_state = num_total % 2 == 0 ? LOOPSTATE_LOOKINGFORTWO : LOOPSTATE_LOOKINGFORONE;
//set k to the index of the middle element, or if there are two middle elements, to the left one
int k = ( num_total - 1 ) / 2;
// If we are looking for two median values, but we have only found one, then this variable will
// hold the value of the one we found. Whether we have found one can be determined by the state of
// the variable loop_state.
T val_found;
for (;;)
{
//make p_start cache the value of *pp_start again, because a previous iteration of the loop
//may have changed the value of pp_start
p_start = *pp_start;
assert( p_start != nullptr );
assert( p_middle != nullptr );
assert( p_end != nullptr );
assert( num_total != 0 );
if ( num_total == 1 )
{
switch ( loop_state )
{
case LOOPSTATE_LOOKINGFORONE:
return p_start->data;
case LOOPSTATE_FOUNDONE:
return arithmetic_mean<T,U>( val_found, p_start->data );
default:
assert( false ); //this should be unreachable
}
}
//select the pivot according to the "median-of-three" rule
T pivot;
if ( p_start->data < p_middle->data )
{
if ( p_middle->data < p_end->data )
pivot = p_middle->data;
else if ( p_start->data < p_end->data )
pivot = p_end->data;
else
pivot = p_start->data;
}
else
{
if ( p_start->data < p_end->data )
pivot = p_start->data;
else if ( p_middle->data < p_end->data )
pivot = p_end->data;
else
pivot = p_middle->data;
}
#if PRINT_DEBUG_LEVEL >= 1
//this line is conditionally compiled for extra debugging information
std::cout << "\nmedian of three: " << (*pp_start)->data << " " << p_middle->data << " " << p_end->data << " ->" << pivot << std::endl;
#endif
// We will be dividing the current partition into 3 new partitions (less-than,
// equal-to and greater-than) each represented as a linked list. Each list
// requires a pointer to the start of the list and a pointer to the pointer at
// the end of the list to write the address of new elements to. Also, when
// traversing the lists, we need to keep a pointer to the middle of the list,
// as this information will be required for selecting a new pivot in the next
// iteration of the loop. The latter is not required for the equal-to partition,
// as it would never be used.
Node<T> *p_less = nullptr, **pp_less_end = &p_less, **pp_less_middle = &p_less;
Node<T> *p_equal = nullptr, **pp_equal_end = &p_equal;
Node<T> *p_greater = nullptr, **pp_greater_end = &p_greater, **pp_greater_middle = &p_greater;
// These pointers are only used as a cache to the location of the end node.
// Despite their similar name, their function is quite different to pp_less_end
// and pp_greater_end.
Node<T> *p_less_end = nullptr;
Node<T> *p_greater_end = nullptr;
// counter for the number of elements in each partition
int num_less = 0;
int num_equal = 0;
int num_greater = 0;
// NOTE:
// The following loop will temporarily split the linked list. It will be merged later.
Node<T> *p_next_node = p_start;
//the following line isn't necessary; it is only used to clarify that the pointers no
//longer point to anything meaningful
*pp_start = p_start = nullptr;
for ( int i = 0; i < num_total; i++ )
{
assert( p_next_node != nullptr );
Node<T> *p_current_node = p_next_node;
p_next_node = p_next_node->next;
if ( p_current_node->data < pivot )
{
//link node to pp_less
assert( *pp_less_end == nullptr );
*pp_less_end = p_less_end = p_current_node;
pp_less_end = &p_current_node->next;
p_current_node->next = nullptr;
num_less++;
if ( num_less % 2 == 0 )
{
pp_less_middle = &(*pp_less_middle)->next;
}
}
else if ( p_current_node->data == pivot )
{
//link node to pp_equal
assert( *pp_equal_end == nullptr );
*pp_equal_end = p_current_node;
pp_equal_end = &p_current_node->next;
p_current_node->next = nullptr;
num_equal++;
}
else
{
//link node to pp_greater
assert( *pp_greater_end == nullptr );
*pp_greater_end = p_greater_end = p_current_node;
pp_greater_end = &p_current_node->next;
p_current_node->next = nullptr;
num_greater++;
if ( num_greater % 2 == 0 )
{
pp_greater_middle = &(*pp_greater_middle)->next;
}
}
}
assert( num_total == num_less + num_equal + num_greater );
assert( num_equal >= 1 );
#if PRINT_DEBUG_LEVEL >= 1
//this section is conditionally compiled for extra debugging information
{
std::cout << std::setfill( '0' );
switch ( loop_state )
{
case LOOPSTATE_LOOKINGFORONE:
std::cout << "LOOPSTATE_LOOKINGFORONE k = " << k << "\n";
break;
case LOOPSTATE_LOOKINGFORTWO:
std::cout << "LOOPSTATE_LOOKINGFORTWO k = " << k << "\n";
break;
case LOOPSTATE_FOUNDONE:
std::cout << "LOOPSTATE_FOUNDONE k = " << k << " val_found = " << val_found << "\n";
}
std::cout << "partition lengths: ";
std::cout <<
std::setw( 2 ) << num_less << " " <<
std::setw( 2 ) << num_equal << " " <<
std::setw( 2 ) << num_greater << " " <<
std::setw( 2 ) << num_total << "\n";
#if PRINT_DEBUG_LEVEL >= 2
Node<T> *p;
std::cout << "less: ";
if ( num_less > 10 )
std::cout << "too many to print";
else
for ( p = p_less; p != nullptr; p = p->next ) std::cout << p->data << " ";
std::cout << "\nequal: ";
if ( num_equal > 10 )
std::cout << "too many to print";
else
for ( p = p_equal; p != nullptr; p = p->next ) std::cout << p->data << " ";
std::cout << "\ngreater: ";
if ( num_greater > 10 )
std::cout << "too many to print";
else
for ( p = p_greater; p != nullptr; p = p->next ) std::cout << p->data << " ";
std::cout << "\n\n" << std::flush;
#endif
std::cout << std::flush;
}
#endif
//insert less-than partition into list
assert( *pp_start == nullptr );
*pp_start = p_less;
//insert equal-to partition into list
assert( *pp_less_end == nullptr );
*pp_less_end = p_equal;
//insert greater-than partition into list
assert( *pp_equal_end == nullptr );
*pp_equal_end = p_greater;
//link list to previously cut off part
assert( *pp_greater_end == nullptr );
*pp_greater_end = p_next_node;
//if less-than partition is large enough to hold both possible median values
if ( k + 2 <= num_less )
{
//set the next iteration of the loop to process the less-than partition
//pp_start is already set to the desired value
p_middle = *pp_less_middle;
p_end = p_less_end;
num_total = num_less;
}
//else if less-than partition holds one of both possible median values
else if ( k + 1 == num_less )
{
if ( loop_state == LOOPSTATE_LOOKINGFORTWO )
{
//the equal_to partition never needs sorting, because all members are already equal
val_found = p_equal->data;
loop_state = LOOPSTATE_FOUNDONE;
}
//set the next iteration of the loop to process the less-than partition
//pp_start is already set to the desired value
p_middle = *pp_less_middle;
p_end = p_less_end;
num_total = num_less;
}
//else if equal-to partition holds both possible median values
else if ( k + 2 <= num_less + num_equal )
{
//the equal_to partition never needs sorting, because all members are already equal
if ( loop_state == LOOPSTATE_FOUNDONE )
return arithmetic_mean<T,U>( val_found, p_equal->data );
return p_equal->data;
}
//else if equal-to partition holds one of both possible median values
else if ( k + 1 == num_less + num_equal )
{
switch ( loop_state )
{
case LOOPSTATE_LOOKINGFORONE:
return p_equal->data;
case LOOPSTATE_LOOKINGFORTWO:
val_found = p_equal->data;
loop_state = LOOPSTATE_FOUNDONE;
k = 0;
//set the next iteration of the loop to process the greater-than partition
pp_start = pp_equal_end;
p_middle = *pp_greater_middle;
p_end = p_greater_end;
num_total = num_greater;
break;
case LOOPSTATE_FOUNDONE:
return arithmetic_mean<T,U>( val_found, p_equal->data );
}
}
//else both possible median values must be in the greater-than partition
else
{
k = k - num_less - num_equal;
//set the next iteration of the loop to process the greater-than partition
pp_start = pp_equal_end;
p_middle = *pp_greater_middle;
p_end = p_greater_end;
num_total = num_greater;
}
}
}
// NOTE:
// The following code is not part of the algorithm, but is only intended to test the algorithm
// This simple class is designed to contain a singly-linked list
template <typename T>
class List
{
public:
List() : first( nullptr ) {}
// the following is required to abide by the rule of three/five/zero
// see: https://en.cppreference.com/w/cpp/language/rule_of_three
List( const List<T> & ) = delete;
List( const List<T> && ) = delete;
List<T>& operator=( List<T> & ) = delete;
List<T>& operator=( List<T> && ) = delete;
~List()
{
Node<T> *p = first;
while ( p != nullptr )
{
Node<T> *temp = p;
p = p->next;
delete temp;
}
}
void push_front( const T &data )
{
Node<T> *temp = new Node<T>;
temp->data = data;
temp->next = first;
first = temp;
}
//member variables
Node<T> *first;
};
int main()
{
//generated random numbers will be between 0 and 2 billion (fits in 32-bit signed int)
constexpr int min_val = 0;
constexpr int max_val = 2*1000*1000*1000;
//will allocate array for 1 million ints and fill with random numbers
constexpr int num_values = 1*1000*1000;
//this class contains the singly-linked list and is empty for now
List<int> l;
double result;
//These variables are used for random number generation
std::random_device rd;
std::mt19937 gen( rd() );
std::uniform_int_distribution<> dis( min_val, max_val );
try
{
//fill array with random data
std::cout << "Filling array with random data..." << std::flush;
auto unsorted_data = std::make_unique<int[]>( num_values );
for ( int i = 0; i < num_values; i++ ) unsorted_data[i] = dis( gen );
//fill the singly-linked list
std::cout << "done\nFilling linked list..." << std::flush;
for ( int i = 0; i < num_values; i++ ) l.push_front( unsorted_data[i] );
std::cout << "done\nCalculating median using STL function..." << std::flush;
//calculate the median using the functions provided by the C++ standard template library.
//Note: this is only done to compare the results with the algorithm provided in this file
if ( num_values % 2 == 0 )
{
int median1, median2;
std::nth_element( &unsorted_data[0], &unsorted_data[(num_values - 1) / 2], &unsorted_data[num_values] );
median1 = unsorted_data[(num_values - 1) / 2];
std::nth_element( &unsorted_data[0], &unsorted_data[(num_values - 0) / 2], &unsorted_data[num_values] );
median2 = unsorted_data[(num_values - 0) / 2];
result = arithmetic_mean<int,double>( median1, median2 );
}
else
{
int median;
std::nth_element( &unsorted_data[0], &unsorted_data[(num_values - 0) / 2], &unsorted_data[num_values] );
median = unsorted_data[(num_values - 0) / 2];
result = static_cast<int>(median);
}
std::cout << "done\nMedian according to STL function: " << std::setprecision( 12 ) << result << std::endl;
// NOTE: Since the STL functions only sorted the array, but not the linked list, the
// order of the linked list is still random and not pre-sorted.
//calculate the median using the algorithm provided in this file
std::cout << "Starting algorithm" << std::endl;
result = find_median<int,double>( l.first );
std::cout << "The calculated median is: " << std::setprecision( 12 ) << result << std::endl;
std::cout << "Cleaning up\n\n" << std::flush;
}
catch ( const std::bad_alloc & )
{
std::cerr << "Error: Unable to allocate sufficient memory!" << std::endl;
return -1;
}
return 0;
}
I have successfully tested my code with one million randomly generated elements and it found the correct median virtually instantaneously.
So what you can do is use iterators to hold the position. I have written the algorithm above to work with the std::forward_list. I know this isn't perfect, but wrote this up quickly and hope it helps.
int partition(int leftPos, int rightPos, std::forward_list<int>::iterator& currIter,
std::forward_list<int>::iterator lowIter, std::forward_list<int>::iterator highIter) {
auto iter = lowIter;
int i = leftPos - 1;
for(int j = leftPos; j < rightPos - 1; j++) {
if(*iter <= *highIter) {
++currIter;
++i;
std::iter_swap(currIter, iter);
}
iter++;
}
std::forward_list<int>::iterator newIter = currIter;
std::iter_swap(++newIter, highIter);
return i + 1;
}
std::forward_list<int>::iterator kthSmallest(std::forward_list<int>& list,
std::forward_list<int>::iterator left, std::forward_list<int>::iterator right, int size, int k) {
int leftPos {0};
int rightPos {size};
int pivotPos {0};
std::forward_list<int>::iterator resetIter = left;
std::forward_list<int>::iterator currIter = left;
++left;
while(leftPos <= rightPos) {
pivotPos = partition(leftPos, rightPos, currIter, left, right);
if(pivotPos == (k-1)) {
return currIter;
} else if(pivotPos > (k-1)) {
right = currIter;
rightPos = pivotPos - 1;
} else {
left = currIter;
++left;
resetIter = left;
++left;
leftPos = pivotPos + 1;
}
currIter = resetIter;
}
return list.end();
}
When making a call to kthSmallest, the left iterator should be one position before where you intend to start. This allows us to be one position behind low in partition(). Here is an example of executing it:
int main() {
std::forward_list<int> list {10, 12, 12, 13, 4, 5, 8, 11, 6, 26, 15, 21};
auto startIter = list.before_begin();
int k = 6;
int size = getSize(list);
auto kthIter = kthSmallest(list, startIter, getEnd(list), size - 1, k);
std::cout << k << "th smallest: " << *kthIter << std::endl;
return 0;
}
6th smallest: 10
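The example calls getSize and getEnd, which are not shown in the answer. The sketch below is one possible way to write them and is an assumption, not the author's code: getSize is assumed to return the number of elements, and getEnd is assumed to return an iterator to the last element (not list.end()), because partition() dereferences highIter to read the pivot.

```cpp
#include <forward_list>
#include <iterator>

// Hypothetical helper: number of elements (std::forward_list has no size() member).
int getSize(std::forward_list<int>& list) {
    return static_cast<int>(std::distance(list.begin(), list.end()));
}

// Hypothetical helper: iterator to the last element, or end() if the list is empty.
std::forward_list<int>::iterator getEnd(std::forward_list<int>& list) {
    auto last = list.end();
    for (auto it = list.begin(); it != list.end(); ++it) {
        last = it; // remember the most recent valid element
    }
    return last;
}
```

Both helpers traverse the whole list, so they cost O(n) each; calling them once before kthSmallest does not change the overall complexity.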

Prevent loop from echoing if another same-value array element has been already echoed in C++

First of all, sorry for the mis-worded title. I couldn't imagine a better way to put it.
The problem I'm facing is as follows: in one part of my program, the program counts occurrences of the different a-zA-Z letters and then tells how many of each letter can be found in an array. The problem, however, is this:
If I have an array that consists of A;A;F;A;D or anything similar, the output will be this:
A - 3
A - 3
F - 1
A - 3
D - 1
But I am required to make it like this:
A - 3
F - 1
D - 1
I could solve the problem easily, however I can't use an additional array to check what values have been already echoed. I know why it happens, but I don't know a way to solve it without using an additional array.
This is the code snippet (the array simply consists of characters, not worthy of adding it to the snippet):
n is the size of array the user is asked to choose at the start of the program (not included in the snippet).
initburts is the current array member ID that is being compared against all other values.
burts is the counter that is being reset after the loop is done checking a letter and moves onto the next one.
do {
for (i = 0; i < n; i++) {
if (array[initburts] == array[i]) {
burts++;
}
}
cout << "\n\n" << array[initburts] << " - " << burts;
initburts++;
burts = 0;
if (initburts == n) {
isDone = true;
}
}
while (isDone == false);
Do your counting first, then loop over your counts printing the results.
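// decltype(array[0]) is a reference type, which cannot be a map key; std::decay_t (from <type_traits>) strips the reference
std::map<std::decay_t<decltype(array[0])>, std::size_t> counts;
std::for_each(std::begin(array), std::end(array), [&counts](const auto& item){ ++counts[item]; });
std::for_each(std::begin(counts), std::end(counts), [](const auto& pair) { std::cout << "\n\n" << pair.first << " - " << pair.second; });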
for (i = 0; i < n; i++)
{
// first check if we printed this character already;
// this is the case if the same character occurred
// before the current one:
bool isNew = true;
for (j = 0; j < i; j++)
{
// you find out yourself, do you?
// do not forget to break the loop
// in case of having detected an equal value!
}
if(isNew)
{
// well, now we can count...
unsigned int count = 1;
for(int j = i + 1; j < n; ++j)
count += array[j] == array[i];
// appropriate output...
}
}
That would do the trick and keeps the array as it is; however, it is an O(n²) algorithm. Sorting the array in advance is more efficient (O(n*log(n))), because you can then just iterate over the array once. Of course, the original order of the array gets lost then:
std::sort(array, array + arrayLength);
auto start = array;
for(auto current = array + 1; current != array + arrayLength; ++current)
{
    if(*current != *start)
    {
        auto c = *start;
        auto count = current - start;
        // output c and count appropriately
        start = current; // the next run of equal characters begins here
    }
}
// now we yet lack the final character:
auto c = *start;
auto count = array + arrayLength - start;
// output c and count appropriately
Pointer arithmetic... Quite likely that your teacher gets suspicious if you just copy this code, but it should give you the necessary hints to make up your own variant (use indices instead of pointers...).
I would do it this way.
#include <iostream>
#include <string>
#include <vector>
using namespace std;
int main()
{
string s;
vector<int> capCount(26, 0), smallCount(26, 0);
cout << "Enter the string\n";
cin >> s;
for(int i = 0; i < s.length(); ++i)
{
char c = s.at(i);
if(c >= 'A' && c <= 'Z')
++capCount[(int)c - 65];
if(c >= 'a' && c <= 'z')
++smallCount[(int)c - 97];
}
for(int i = 0; i < 26; ++i)
{
if(capCount[i] > 0)
cout << (char) (i + 65) << ": " << capCount[i] << endl;
if(smallCount[i] > 0)
cout << (char) (i + 97) << ": " << smallCount[i] << endl;
}
}
Note: I have differentiated lower and upper case characters.

Shifting Objects Up in an Array

I'm creating a program that creates an array of objects in random positions in an array size 8. Once created, I need them to sort so that all the objects in the array are shifted up to the top, so no gaps exist between them. I'm almost there, but I cannot seem to get them to swap to index 0 in the array, and they instead swap to index 1. Any suggestions? (Must be done the way I'm doing it, not with other sorting algorithms or whatnot)
#include <iostream>
#include <string>
#include <ctime>
using namespace std;
struct WordCount {
string name = "";
int count = 0;
};
int main() {
cout << "Original random array: " << endl;
srand(static_cast<int>(time(0)));
int i = 0;
WordCount wordArr[8];
while (i < 4) {
int randomNum = 0 + (rand() % static_cast<int>(7 + 1));
if(wordArr[randomNum].name == "") {
wordArr[randomNum].name = "word" + static_cast<char>(i);
wordArr[randomNum].count = i;
i++;
}
}
int j = 0;
while (j < 8) {
cout << wordArr[j].name << " " << wordArr[j].count << endl;
j++;
}
cout << "\n\nSorted array: " << endl;
for (int i = 7; i >= 0; i--) {
for (int j = 0; j <= 7; j++) {
if (wordArr[i].name != "") {
if (wordArr[j].name == "") {
WordCount temp = wordArr[i];
wordArr[i] = wordArr[j];
wordArr[j] = temp;
}
}
}
}
int k = 0;
while (k < 8) {
cout << wordArr[k].name << " " << wordArr[k].count << endl;
k++;
}
return 0;
}
If I understand your requirement correctly, you want to move all the non-blank entries to the start of the array. To do this, you need an algorithm like this for example:
for i = 0 to 7
    if wordArr[i].name is blank
        for j = i + 1 to 7
            if wordArr[j].name is not blank
                swap [i] and [j]
                break
So, starting from the beginning, if we encounter a blank entry, we look forward for the next non-blank entry. If we find such an entry, we swap the blank and non-blank entry, then break to loop again looking for the next blank entry.
Note, this isn't the most efficient of solutions, but it will get you started.
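The pseudocode above could be written in C++ roughly as follows. This is only a sketch: the helper name shiftUp and its pointer-plus-size signature are assumptions for illustration, while WordCount mirrors the struct from the question.

```cpp
#include <cstddef>
#include <string>
#include <utility>

struct WordCount {
    std::string name = "";
    int count = 0;
};

// Hypothetical helper: moves non-blank entries to the front of the array.
// The relative order of the non-blank entries is preserved.
void shiftUp(WordCount* arr, std::size_t size)
{
    for (std::size_t i = 0; i < size; ++i) {
        if (!arr[i].name.empty())
            continue; // this slot is already filled
        // Look forward for the next non-blank entry and pull it up.
        for (std::size_t j = i + 1; j < size; ++j) {
            if (!arr[j].name.empty()) {
                std::swap(arr[i], arr[j]);
                break; // go back to looking for the next blank slot
            }
        }
    }
}
```

With the array from the question this would be called as shiftUp(wordArr, 8).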
Note also I'd replace the 4 and 8 with definitions like:
#define MAX_ENTRIES (8)
#define TO_GENERATE_ENTRIES (4)
Finally:
wordArr[randomNum].name = "word" + static_cast<char>(i);
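That will not do what you want it to do: "word" is a string literal, so adding a char to it is pointer arithmetic, not concatenation. Convert the literal to a std::string first:
wordArr[randomNum].name = std::string("word") + static_cast<char>('0' + i);
This appends the digit, not the byte code, to the end of the name. Or perhaps, if you have C++11: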
wordArr[randomNum].name = "word" + std::to_string(i);
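To make the difference between concatenation and pointer arithmetic concrete, here is a small sketch. Both function names (makeName and skipChars) are hypothetical and exist only for this illustration.

```cpp
#include <string>

// Hypothetical helper: builds "word0", "word1", ...
std::string makeName(int i)
{
    // Converting the literal to std::string first makes + mean concatenation.
    return std::string("word") + std::to_string(i);
}

// For contrast: adding an integer to a C string pointer moves the pointer.
// Only offsets 0..4 are valid here ("word" is 4 characters plus a terminator).
const char* skipChars(int offset)
{
    const char* w = "word";
    return w + offset;
}
```

Here makeName(2) yields "word2", whereas skipChars(2) points at "rd".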
I see a couple of problems.
The expression "word" + static_cast<char>(i); doesn't do what you are hoping it does.
It is equivalent to:
char const* w = "word";
char const* p = w + i;
When i is 2, p will be "rd". You need to use std::string("word") + std::to_string(i).
The logic for moving objects with the non-empty names to objects with empty names did not make sense to me. It obviously does not work for you. The following updated version works for me:
for (int i = 0; i <= 7; ++i) {
// If the name of the object at wordArr[i] is not empty, move on to the
// next item in the array. If it is empty, copy the next object that
// has a non-empty name.
if ( wordArr[i].name == "") {
// Start comparing from the object at wordArr[i+1]. There
// is no need to start at wordArr[i]. We know that it is empty.
for (int j = i+1; j <= 7; ++j) {
if (wordArr[j].name != "") {
WordCount temp = wordArr[i];
wordArr[i] = wordArr[j];
wordArr[j] = temp;
// Stop at the first non-empty object so the original order is kept.
break;
}
}
}
}
There were two problems:
wordArr[randomNum].name = "word" + static_cast<char>(i); is not what you are looking for. If you want the names to be generated correctly, you need something like this:
wordArr[randomNum].name = "word " + std::to_string(i);
Your sorting loop does not do what you want; it only checks for the "gaps", as you said. You need something like this:
for (int i = 0; i < 8; ++i) {
for (int j = i+1; j < 8; ++j) {
if (wordArr[i].name == "" || (wordArr[i].count < wordArr[j].count)) {
WordCount temp = wordArr[i];
wordArr[i] = wordArr[j];
wordArr[j] = temp;
}
}
}
Your algorithm sorts the array, but then loses the sorting again.
You want to swap elements only when i > j, in order to push elements to the top only. As a result, you need to change this:
if (wordArr[j].name == "")
to this:
if (wordArr[j].name == "" && i > j)
Consider this array example:
0
ord 1
0
0
rd 2
word 0
d 3
0
Your code will sort it to:
d 3
ord 1
word 0
rd 2
0
0
0
0
but when i = 3, it will try to populate the 5th cell, and it will swap it with rd 2, which is not what we want.
This will push rd 2 down. We want the gaps (zeroes) to go to the end of the array, so we need to swap elements only when they move higher, not lower, which is equivalent to saying when i > j.
PS: If you are a beginner, skip this part.
You can optimize the inner loop by using one if statement and a break keyword, like this:
for (int j = 0; j <= 7; j++) {
if (wordArr[i].name != "" && wordArr[j].name == "" && i > j) {
WordCount temp = wordArr[i];
wordArr[i] = wordArr[j];
wordArr[j] = temp;
break;
}
}

Binary search not working for n = 1, 2

This is my code for binary search, and n = no of elements in array
// Binary Search
// BUG: not working for n = 2
#include <iostream>
int main() {
const int n = 1;
int newlist[n];
std::cout << "Enter " << n;
std::cout << " elements in increasing order:\n";
for( int i = 0; i < n; ++i ) {
std::cin >> newlist[i];
}
int pos = 0, num;
std::cout << "Enter number:\n";
std::cin >> num;
std::cout << '\n';
int imin = 0, imax = n-1;
int imid = (n - 1)/2;
for( int i = 0; i < n; ++i ) {
imid = (imin + imax) / 2;
if( newlist[imid] == num ) {
pos = imid;
}
else if( newlist[imid] < num ) {
imin = imid+1;
}
else {
imax = imid-1;
}
}
if( pos != 0 ) {
std::cout << "Found at " << pos+1;
}
else {
std::cout << "Not found!\n";
}
return 0;
}
It does work for n > 2, but fails to give correct output for n <= 2, ie, gives Not found! output even for elements that were found.
I think one way would be to have a separate implementation for n <= 2, but that will become cumbersome! Please help.
Set your pos variable to -1 rather than 0. 0 is your first index, and since you report that the element was not found when pos == 0, your code fails for an element at the first index. Initialize pos to -1 and test against that value for the not-found case; if an element is found at pos = 0, it exists at the first index.
First, pos equal to 0 is a valid result. Therefore set pos to -1 at the beginning and compare against -1 (or, more commonly, check pos >= 0) when deciding whether the value was found.
Secondly, a few things should be changed, because right now it's not much of a binary search:
There is no reason to initialize mid before the loop; it's just a temporary variable scoped to the loop body.
The condition for exiting the search is min > max. You don't need an additional counter, which always runs the loop n times even if the value doesn't exist. So change it to while (min <= max) { ...
Last but not least, once you find the item, exit the loop immediately with a break statement.
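Putting these points together, a binary search might look like the sketch below. It is one possible arrangement, not the asker's code: the function name binarySearch and the std::vector parameter are assumptions, and it returns from the function where the loop version would break.

```cpp
#include <vector>

// Hypothetical helper: returns the 0-based index of num, or -1 if absent.
// Assumes `sorted` is in increasing order.
int binarySearch(const std::vector<int>& sorted, int num)
{
    int imin = 0;
    int imax = static_cast<int>(sorted.size()) - 1;
    while (imin <= imax) {
        // Midpoint computed this way cannot overflow like imin + imax could.
        int imid = imin + (imax - imin) / 2;
        if (sorted[imid] == num)
            return imid;      // found: stop immediately
        if (sorted[imid] < num)
            imin = imid + 1;  // continue in the upper half
        else
            imax = imid - 1;  // continue in the lower half
    }
    return -1; // the range became empty, so the value is not present
}
```

Because -1 is never a valid index, a match at index 0 is reported correctly, which covers the n = 1 and n = 2 cases from the question.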
I don't think a for-loop is the control structure to go for here, because you want to finish either when you've found the correct item or when imin and imax have become nonsensical.
In the implementation given, you don't even stop the loop when you have found the item and just confirm the found item "n-(number of iterations until item was found)" times.
Furthermore, with C++ arrays and vectors being 0-based, having position == 0 as the marker for "not found" is a bad idea; you could instead use an item from http://en.cppreference.com/w/cpp/types/numeric_limits, or n (since the indices go from 0 to n-1).
In theory, you could use pointer arithmetic to make your array 1-based; I am assuming you haven't, and I wouldn't recommend it. However, your code snippet is missing the actual definition of the list.

Calculating Prime Numbers using Sets, C++

I am trying to calculate the prime numbers using a set but when I do the calculation my iterator is jumping randomly.
I am trying to implement this method for an value of N=10.
Choose an integer n. This function will compute all prime numbers up
to n. First insert all numbers from 1 to n into a set. Then erase all
multiples of 2 (except 2); that is, 4, 6, 8, 10, 12, .... Erase all
multiples of 3, that is, 6, 9, 12, 15, ... . Go up to sqrt(n) . The
remaining numbers are all primes.
When I run my code, it erases 1 and then pos jumps to 4? I am not sure why this happens instead of it going to the value 2 which is the 2nd value in the set?
Also what happens after I erase a value that the iterator is pointing to, what does the iterator point to then and if I advance it where does it advance?
Here is the code:
set<int> sieveofEratosthenes(int n){ //n = 10
set<int> a;
set<int>::iterator pos = a.begin();
//generate set of values 1-10
for (int i = 1; i <= n; i++) {
a.insert(i);
if(pos != a.end())
pos++;
}
pos = a.begin();
//remove prime numbers
while (pos != a.end())
{
cout << "\nNew Iteration \n\n";
for (int i = 1; i < sqrt(n); i++) {
int val = *pos%i;
cout << "Pos = " << *pos << "\n";
cout << "I = " << i << "\n";
cout << *pos << "/" << i << "=" << val << "\n\n";
if (val == 0) {
a.erase(i);
}
}
pos++;
}
return a;
}
Your implementation is incorrect in that it is trying to combine the sieve algorithm with the straightforward algorithm of trying out divisors, and it does so unsuccessfully. You do not need to test divisibility to implement the sieve - in fact, that's a major contributor to the beauty of the algorithm! You do not even need multiplication.
a.erase(1);
pos = a.begin();
while (pos != a.end()) {
int current = *pos++;
// "remove" is the number to remove.
// Start it at twice the current number
int remove = current + current;
while (remove <= n) {
a.erase(remove);
// Add the current number to get the next item to remove
remove += current;
}
}
When erasing elements inside a loop you have to be careful with the indices. For example, when you erase the element at position 0, the next element is now at position 0. Thus the loop should look something like this:
for (int i = 1; i < sqrt(n); /*no increment*/) {
/* ... */
if (val == 0) {
a.erase(i);
} else {
i++;
}
}
Actually, you also have to take into account that the set shrinks while you erase elements. Thus you had better use iterators:
for (auto it = a.begin(); it != a.end(); /*no increment*/) {
/* ... */
if (val == 0) {
it = a.erase(it); // erase returns the iterator following the erased element
} else {
it++;
}
}
PS: the above is not exactly what you need for the sieve, but it should be sufficient to demonstrate how to erase elements (I hope so).
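On the question of what an iterator points to after its element is erased: for std::set, erasing invalidates only iterators to the erased element, and since C++11 erase returns the iterator to the element that follows. A minimal sketch of erasing while iterating is shown below; the helper name eraseMultiples is hypothetical, and it assumes divisor is positive.

```cpp
#include <set>

// Hypothetical helper: removes every multiple of `divisor` that is
// greater than `divisor` itself. Assumes divisor > 0.
void eraseMultiples(std::set<int>& values, int divisor)
{
    for (auto it = values.begin(); it != values.end(); /* no increment */) {
        if (*it > divisor && *it % divisor == 0)
            it = values.erase(it); // continue with the element after the erased one
        else
            ++it;
    }
}
```

Calling it with 2 and then 3 on a set holding 2 through 10 would leave 2, 3, 5 and 7. This uses a divisibility test purely to demonstrate safe erasure; it is not the sieve itself.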