I have been trying a sorting method in which I subtract each number stored in an array from every element of the same array. I noticed a pattern: the number of differences that come out negative is the rank, or position, of that element in the sorted array. But things go wrong when the array contains repeated entries.
My basic method is:
Take every element of the SampleArray.
Subtract it from every element of the SampleArray.
Check whether the difference is negative.
If it is, increase a variable called counter.
Use this counter as the position of the element in the sorted array.
For example, let's take (5,2,6,4).
First take 5 and subtract it from each of the numbers, which gives (0,-3,1,-1), so counter becomes 2, which is the index of 5 in the sorted array. Repeat this for each of the elements.
for 5, counter will be 2.
for 2, counter will be 0.
for 6, counter will be 3.
for 4, counter will be 1.
And hence the sorted Array will be {2,4,5,6}.
First, see the code :
#include <iostream>
using namespace std;
void sorting(int myArray[], int sizeofArray);
int main()
{
int checkArray[] = {5,4,2,20,12,13,8,6,10,15,0}; //my sample array
int sized;
sized=sizeof checkArray/sizeof(int);//to know the size
cout << sized << endl;
sorting(checkArray, sized);
}
void sorting(int myArray[], int sizeofArray)
{
int tempArray[sizeofArray];
for (int i=0; i<sizeofArray; i++)
{
int counter=0;
for (int j=0;j<sizeofArray; j++ )
{
int checkNum = myArray[j]-myArray[i];
if (checkNum<0)
counter++; //to know the numbers of negatives
else
counter+=0;
}
tempArray[counter]=myArray[i];
}
for (int x=0;x<sizeofArray; x++)
{
cout << tempArray[x] << " " ;
}
}
Now, if we run this program with entries that have no repetitions, it sorts the array. But if we use repeated entries like
int checkArray[] = {8,2,4,4,6};
tempArray gets 2 as its first element, since counter is 0.
tempArray gets 4 as its second element, since counter is 1.
But tempArray never gets its third element, since counter is still 1 for the second 4, so the program prints some arbitrary number in that place (this is where things go wrong).
Can you please suggest a method to solve this?
This is an odd way of writing insertion sort, https://en.wikipedia.org/wiki/Insertion_sort
I would assume you can change your condition to:
if (checkNum<0 || (checkNum==0 && j<i))
But I would suggest using a proper sorting routine instead
The idea is to separate duplicates by saying that if the values are the same we sort according to their order in the sequence; as if the sequence was a pair of the value and the sequence number (0, 1, 2, 3, 4, 5, ...).
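To make the tie-breaking rule concrete, here is a minimal sketch of the whole routine with that condition in place. The name rank_sort and the use of std::vector are my own choices, not part of the question's code, and I compare the values directly instead of subtracting them so that large inputs cannot overflow.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: places each element at the index equal to the
// number of elements that must come before it in the sorted order.
std::vector<int> rank_sort(const std::vector<int>& in)
{
    std::vector<int> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        std::size_t counter = 0;
        for (std::size_t j = 0; j < in.size(); ++j)
        {
            // Smaller values come first; equal values keep their input order.
            if (in[j] < in[i] || (in[j] == in[i] && j < i))
                ++counter;
        }
        out[counter] = in[i];
    }
    return out;
}
```

With the input {8,2,4,4,6} the two 4s get counters 1 and 2, so every slot of the output is written exactly once.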
The issue here is that for any two equal numbers the nested loop produces the same counter value. Thus, for such a counter value, tempArray[counter + 1] will never be initialized.
One way to solve this would be to maintain a vector<bool> recording which positions have already been written, and to write to the next free position when the computed one is already taken.
But maintaining a second vector is just going to make your O(n^2) code slower. Consider using std::sort instead:
std::sort(std::begin(checkArray), std::end(checkArray));
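For completeness, the vector<bool> idea could look roughly like this. It is only a sketch: the function name rank_sort_with_flags and the std::vector interface are assumptions of mine, and std::sort remains the better choice in practice.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: rank sort that skips over slots already taken
// by an equal value.
std::vector<int> rank_sort_with_flags(const std::vector<int>& in)
{
    std::vector<int> out(in.size());
    std::vector<bool> written(in.size(), false);
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        // Count the strictly smaller elements, as in the question.
        std::size_t counter = 0;
        for (std::size_t j = 0; j < in.size(); ++j)
            if (in[j] < in[i])
                ++counter;
        // A taken slot can only hold an equal value, so move to the next free one.
        while (written[counter])
            ++counter;
        out[counter] = in[i];
        written[counter] = true;
    }
    return out;
}
```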
I have legacy code, simplified as below:
std::map< int, std::vector<int> > list2D;
Given a range (a number range, for example 10 to 90), I first need to filter the map and eliminate the elements whose keys are not within the range. For example, given the elements
20 -> {10,20}, 40 -> {500,200}, 100 -> {1,2,3} and the range 10-90,
I need to filter out the one with key 100.
Then I need to concatenate all the vectors that are left, so the result will be {10,20,500,200}.
My legacy code did that with two for loops. I am planning to use the lower_bound function for the filtering step, but it seems I will still need a loop. A simplified version can be seen below. To be honest, the version with two for loops looked simpler.
void simplifiedFoo ()
{
std::vector<int> returnVal;
std::map<int, std::vector<int> > list2D = {{20 , {10,20} },
{40 , {500,200}},
{100 , {1, 2, 3}} };
auto itlow=list2D.lower_bound (10);
auto itup=list2D.lower_bound (50);
if ( itlow != list2D.end() && itup == list2D.end() )// Don't like this if even not sure if it is correct.
--itup;
while ( itlow != itup) // How to avoid second loop
{
returnVal.insert(returnVal.end(), itlow->second.begin(), itlow->second.end());
++itlow;
}
for ( auto elem : returnVal)
std::cout << elem << " " ;
}
What would be the preferable clean way to do that (I need to implement this with VS2010)? Is there a clean way to achieve "reduce" functionality in C++ for my case?
Not sure if this is what you mean, but if you are looking for using "std oneliners", this would work:
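std::for_each(itlow, itup,
    // explicit parameter type: generic lambdas (auto parameters) need C++14
    [&returnVal](const std::pair<const int, std::vector<int> >& elem) {
        returnVal.insert(returnVal.end(), elem.second.begin(), elem.second.end());
    });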
Now, can we call this "the preferable clean way for that"? I think it is debatable.
I think you must use upper_bound for the second value; otherwise the upper value will be excluded.
auto first = list2D.lower_bound(10);
auto last = list2D.upper_bound(90);
You don't need to check whether the lower iterator is != end() and the upper one is == end(), so your loop just becomes
for (; first != last; ++first) {
// ...
}
If the upper iterator were equal to end(), it would be equivalent to
for (; first != list2D.end(); ++first) {
// ...
}
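Putting the two bounds and the loop together, a minimal sketch could look like this. The names Map2D and concat_in_range are made up for illustration, the range is assumed to be inclusive on both ends, and I kept it to C++03 syntax since the question mentions VS2010.

```cpp
#include <map>
#include <vector>

typedef std::map<int, std::vector<int> > Map2D; // hypothetical alias

// Concatenates the vectors of all entries whose key lies in [low, high].
std::vector<int> concat_in_range(const Map2D& m, int low, int high)
{
    std::vector<int> result;
    if (low > high)
        return result; // empty range; also keeps first from passing last
    Map2D::const_iterator first = m.lower_bound(low);  // first key >= low
    Map2D::const_iterator last = m.upper_bound(high);  // first key > high
    for (; first != last; ++first)
        result.insert(result.end(), first->second.begin(), first->second.end());
    return result;
}
```

For the map from the question and the range 10 to 90 this collects 10, 20, 500, 200 and skips the entry with key 100.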
I am starting a new question. I posted a question yesterday to find out what the problem in my program was. The program is given below, and you pointed out that it does only one pass of the sort and needs an outer loop as well. At the time I accepted that. But when I looked at the program again I got confused, and I need to ask why we need an outer loop for the sort, since (in my opinion) a single loop can do the sorting. First see the program below; I present my logic at the end.
#include <iostream>
#include <conio.h> // non-standard header, needed only for getch()
using namespace std;
int main()
{
int number[10];
int temp = 0;
int i = 0;
cout << "Please enter any ten numbers to sort one by one: "
<< "\n";
for (i = 0; i < 10; i++)
{
cin >> number[i];
}
i = 0;
for (i = 0; i < 9; i++)
{
if (number[i] > number[i + 1])
{
temp = number[i + 1];
number[i + 1] = number[i];
number[i] = temp;
}
}
i = 0;
cout << "The sorted numbers are given below:"
<< "\n";
for (i = 0; i < 10; i++)
{
cout << number[i] << "\n";
}
getch();
}
I think the ONLY loop, with the bubble condition, should do the sorting. Look at the following loop of the program:
for (i=0;i<9;i++)
if(number[i]>number[i+1])
{
temp=number[i+1];
number[i+1]=number[i];
number[i]=temp;
}
Now I will explain what I think this loop "should" do. It will first compare number[0] with number[1]. If the condition is satisfied, it will do what is in the if statement's body. Then i will be incremented by 1 (i++). On the next iteration the values compared will be number[1] and number[2]. Then why does the sorting not happen, and why does the loop exit after only one pass? In other words, maybe I'm asking whether the if statement repeats itself in the for loop. In my opinion it does. I'm very thankful for help and views; my question might be basic, but that is how I will progress.
Let me give you an example with only three numbers. Say you input
13, 3, 1
Now you start sorting the way you did it, so it compares 13 and 3.
13 > 3, so the two are swapped.
Now we have
3, 13, 1
Next it compares, as you said, the next pair: 13 and 1.
13 > 1, so the new order is
3, 1, 13
Now your loop is finished, and you never compared 3 and 1.
In effect, a single pass only moves the greatest number into its final place!
since only a single loop can do the sorting(In my opinion)
This is not correct. Without going into details, a constant number of passes over the array is not enough to sort, since sorting is an Omega(n log n) problem. That means an O(1) number of loops (any constant, including 1), each doing O(n) work, is not enough, for any algorithm (1)(2).
Consider the input
5, 4, 3, 2, 1
a single loop of bubble sort will do:
4, 5, 3, 2, 1
4, 3, 5, 2, 1
4, 3, 2, 5, 1
4, 3, 2, 1, 5
So the algorithm will end up with the array: [ 4, 3, 2, 1, 5], which is NOT sorted.
After one loop of bubble sort, you are only guaranteed to have the last element in place (which indeed happens in the example). The second iteration will make sure the 2 last elements are in place, and the nth iteration will make sure the array is indeed sorted, resulting in n loops, which is achieved via a nested loop.
(1) The outer loop is sometimes hidden as a recursive call (quick sort is an example where it happens) - but there is still a loop.
(2) Comparison-based algorithms, to be exact.
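To show what the nested loop looks like, here is a minimal sketch of a complete bubble sort. It uses std::vector and a function named bubble_sort, both my own choices rather than the question's code, and it stops early once a pass makes no swap.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: repeats the single pass from the question until
// the array is sorted.
void bubble_sort(std::vector<int>& v)
{
    if (v.size() < 2)
        return;
    // Outer loop: at most n-1 passes are needed.
    for (std::size_t pass = 0; pass + 1 < v.size(); ++pass)
    {
        bool swapped = false;
        // Inner loop: one pass. The last `pass` elements are already in place.
        for (std::size_t i = 0; i + 1 < v.size() - pass; ++i)
        {
            if (v[i] > v[i + 1])
            {
                std::swap(v[i], v[i + 1]);
                swapped = true;
            }
        }
        if (!swapped)
            break; // no swap in a whole pass means the array is sorted
    }
}
```

On 5, 4, 3, 2, 1 the first pass yields 4, 3, 2, 1, 5 as traced above, and three more passes finish the job.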
In bubble sort, a pass simply moves the largest remaining element to the end of the array. So you need n-1 passes to get a sorted array; that's why you need another loop. For your code, one pass means
if(number[0]>number[0+1])
{
temp=number[0+1];
number[0+1]=number[0];
number[0]=temp;
}
if(number[1]>number[1+1])
{
temp=number[1+1];
number[1+1]=number[1];
number[1]=temp;
}
.....6 more times
if(number[8]>number[8+1])
{
temp=number[8+1];
number[8+1]=number[8];
number[8]=temp;
}
So, as you can see, the if statement does repeat; it's just that after all 9 ifs only the largest element has moved to the end of the array.
This is not correct because
The algorithm gets its name from the way smaller elements "bubble" to the top of the list. (Bubble sort)
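So, at the end of the first pass, only one element is guaranteed to be in its final position (in your code it is the largest one, which ends up at the end of the array). For a complete sort we therefore have to repeat the pass, up to n-1 times in total (where n is the number of elements).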
I got this question at an interview and at the end was told there was a more efficient way to do it, but I still have not been able to figure it out. You pass into a function an array of integers and an integer for the size of the array. The array contains a lot of numbers, some of which repeat, for example 1,7,4,8,2,6,8,3,7,9,10. You want to take that array and return an array where all the repeated numbers are put at the end, so the above array would turn into 1,7,4,8,2,6,3,9,10,8,7. The numbers I used are not important, and I could not use a buffer array. I was going to use a BST, but the order of the numbers must be maintained (except for the duplicate numbers). I could not figure out how to use a hash table, so I ended up using a double for loop (n^2, horrible, I know). How would I do this more efficiently using C++? I am not looking for code, just an idea of how to do it better.
In what follows:
arr is the input array;
seen is a hash set of numbers already encountered;
l is the index where the next unique element will be placed;
r is the index of the next element to be considered.
Since you're not looking for code, here is a pseudo-code solution (which happens to be valid Python):
arr = [1,7,4,8,2,6,8,3,7,9,10]
seen = set()
l = 0
r = 0
while True:
    # advance `r` to the next not-yet-seen number
    while r < len(arr) and arr[r] in seen:
        r += 1
    if r == len(arr): break
    # add the number to the set
    seen.add(arr[r])
    # swap arr[l] with arr[r]
    arr[l], arr[r] = arr[r], arr[l]
    # advance `l`
    l += 1
print(arr)
On your test case, this produces
[1, 7, 4, 8, 2, 6, 3, 9, 10, 8, 7]
I would use an additional map, where the key is the integer value from the array and the value is an integer set to 0 in the beginning. Now I would go through the array and increase the values in the map if the key is already in the map.
In the end I would go again through the array. When the integer from the array has a value of one in the map, I would not change anything. When it has a value of 2 or more in the map I would swap the integer from the array with the last one.
This should result in a runtime of O(n*log(n))
The way I would do this would be to create an array twice the size of the original and create a set of integers.
Then Loop through the original array, add each element to the set, if it already exists add it to the 2nd half of the new array, else add it to the first half of the new array.
In the end you would get an array that looks like: (using your example)
1,7,4,8,2,6,3,9,10,-,-,8,7,-,-,-,-,-,-,-,-,-
Then I would loop through the original array again and make each spot equal to the next non-null position (or 0'd or whatever you decided)
That would make the original array turn into your solution...
This ends up being O(n) which is about as efficient as I can think of
Edit: since you cannot use another array, when you find a value that is already in the set you can move every value after it forward by one and set the last value equal to the number you just checked. This would in effect do the same thing, but with a lot more operations.
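The shifting described in the edit is what std::rotate does, so a rough sketch of it could look like this. The function name move_repeats_to_end is an assumption, a hash set is still used for the lookups, and the worst case is O(n^2) because every repeat shifts the rest of the array.

```cpp
#include <algorithm>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical helper: moves every repeated value to the end of v, in place,
// keeping the relative order of the first occurrences.
void move_repeats_to_end(std::vector<int>& v)
{
    std::unordered_set<int> seen;
    std::size_t end = v.size(); // elements in [end, size) are repeats already moved
    std::size_t i = 0;
    while (i < end)
    {
        if (seen.insert(v[i]).second)
        {
            ++i; // first occurrence: leave it where it is
        }
        else
        {
            // Shift everything after v[i] forward by one and put v[i] last.
            std::rotate(v.begin() + i, v.begin() + i + 1, v.end());
            --end; // do not advance i: a new value has moved into position i
        }
    }
}
```

Because each repeat is appended behind the ones moved earlier, the repeats also end up in the order in which they were encountered.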
I have been out of touch for a while, but I'd probably start out with something like this and see how it scales with larger input. I know you didn't ask for code but in some cases it's easier to understand than an explanation.
Edit: Sorry I missed the requirement that you cannot use a buffer array.
// returns new vector with dupes at the end
std::vector<int> move_dupes_to_end(std::vector<int> input)
{
std::set<int> counter;
std::vector<int> result;
std::vector<int> repeats;
for (std::vector<int>::iterator i = input.begin(); i < input.end(); i++)
{
if (counter.find(*i) == counter.end())
result.push_back(*i);
else
repeats.push_back(*i);
counter.insert(*i);
}
result.insert(result.end(), repeats.begin(), repeats.end());
return result;
}
#include <algorithm>
T * array = [your array];
size_t size = [array size];
// Complexity:
sort( array, array + size ); // n * log(n) and could be threaded
// (if merge sort)
T * last = unique( array, array + size ); // n, but the elements after the last
// unique element are not defined
Check sort and unique.
#include <unordered_set>
#include <utility>

void remove_dup(int* data, int count) {
    int* L = data;                          // place to put the next unique number
    std::unordered_set<int> found(count);   // keeps track of what has been seen
    for (int* cur = data; cur != data + count; ++cur) {
        if (found.insert(*cur).second)      // first time we see this value
            std::swap(*cur, *L++);          // append it to the unique prefix
        // a repeat stays where it is; later unique values are swapped past it
    }
}
http://ideone.com/3choA
Not that I would turn in code this poorly commented. Also note that unordered_set probably uses its own array internally, bigger than data. (This has been rewritten based on aix's answer, to be much faster.)
If you know the bounds on what the integer values are, B, and the size of the integer array, SZ, then you can do something like the following:
Create an array of booleans seen_before with B elements, initialized to 0.
Create a result array result of integers with SZ elements.
Create two integers, one for front_pos = 0, one for back_pos = SZ - 1.
Iterate across the original list:
Set an integer variable val to the value of the current element
If seen_before[val] is set to 1, put the number at result[back_pos] then decrement back_pos
If seen_before[val] is not set to 1, put the number at result[front_pos] then increment front_pos and set seen_before[val] to 1.
Once you finish iterating across the main list, all the unique numbers will be at the front of the list while the duplicate numbers will be at the back. Fun part is that the entire process is done in one pass. Note that this only works if you know the bounds of the values appearing in the original array.
Edit: It was pointed out that there are no bounds on the integers used, so instead of initializing seen_before as an array with B elements, initialize it as a map<int, bool>, then continue as usual. That should get you n*log(n) performance.
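The bounded version of the steps above might be sketched as follows. The function name split_by_first_seen is made up, values are assumed to lie in [0, bound), and back_pos starts one past the end and is decremented before use so that an unsigned index never goes below zero.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper. Assumes every value v satisfies 0 <= v < bound.
std::vector<int> split_by_first_seen(const std::vector<int>& in, std::size_t bound)
{
    std::vector<bool> seen_before(bound, false);
    std::vector<int> result(in.size());
    std::size_t front_pos = 0;
    std::size_t back_pos = in.size(); // one past the next free slot at the back
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        const int val = in[i];
        if (seen_before[val])
        {
            result[--back_pos] = val;   // repeat: fill from the back
        }
        else
        {
            result[front_pos++] = val;  // first occurrence: fill from the front
            seen_before[val] = true;
        }
    }
    return result;
}
```

Note that the repeats are written from the back, so they come out in reverse order of encounter: the example input gives 1,7,4,8,2,6,3,9,10,7,8.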
This can be done by iterating over the array and marking the index of the first duplicate.
Later on, swap the value at that marked index with the next unique value,
and then increment the marked index for the next swap.
Java Implementation:
public static void solve() {
Integer[] arr = new Integer[] { 1, 7, 4, 8, 2, 6, 8, 3, 7, 9, 10 };
final HashSet<Integer> seen = new HashSet<Integer>();
int l = -1;
for (int i = 0; i < arr.length; i++) {
if (seen.contains(arr[i])) {
if (l == -1) {
l = i;
}
continue;
}
// record the value before it is swapped away from position i
seen.add(arr[i]);
if (l > -1) {
final int temp = arr[i];
arr[i] = arr[l];
arr[l] = temp;
l++;
}
}
}
output is 1 7 4 8 2 6 3 9 10 8 7
It's ugly, but it meets the requirements of moving the duplicates to the end in place (no buffer array)
// warning, some light C++11
void dup2end(int* arr, size_t cnt)
{
std::set<int> k;
auto end = arr + cnt-1;
auto max = arr + cnt;
auto curr = arr;
while(curr < max)
{
auto res = k.insert(*curr);
// first time encountered
if(res.second)
{
++curr;
}
else
{
// duplicate:
std::swap(*curr, *end);
--end;
--max;
}
}
}
void move_duplicates_to_end(vector<int> &A) {
if(A.empty()) return;
int i = 0, tail = A.size()-1;
while(i <= tail) {
bool is_first = true; // check of current number is first-shown
for(int k=0; k<i; k++) { // always compare with numbers before A[i]
if(A[k] == A[i]) {
is_first = false;
break;
}
}
if(is_first == true) i++;
else {
int tmp = A[i]; // swap with tail
A[i] = A[tail];
A[tail] = tmp;
tail--;
}
}
}
If the input array is {1,7,4,8,2,6,8,3,7,9,10}, then the output is {1,7,4,8,2,6,10,3,9,7,8}. Compared with your answer {1,7,4,8,2,6,3,9,10,8,7}, the first six elements are the same while the rest differ, because I swap each duplicate with the current tail of the array. As you mentioned, the order of the duplicates can be arbitrary.
I need help designing an algorithm to solve this problem: there is a row of numbers, each of which may appear several times, and I need to find the number that appears most often and how many times it appears, e.g.:
1-1-5-1-3-7-2-1-8-9-1-2
That would be 1 and it appears 5 times.
The algorithm should be fast (that's my problem).
Any ideas ?
What you're looking for is called the mode. You can sort the array, then look for the longest repeating sequence.
You could keep a hash table and store a count of every element in that structure, like this
h[1] = 5
h[5] = 1
...
You can't get it any faster than in linear time, as you need to at least look at each number once.
If you know that the numbers are in a certain range, you can use an additional array to sum up the occurrences of each number, otherwise you'd need a hashtable, which is slightly slower.
Both of these need additional space though and you need to loop through the counts again in the end to get the result.
Unless you really have a huge amount of numbers and absolutely require O(n) runtime, you could simply sort your array of numbers. Then you can walk once through the numbers and simply keep the count of the current number and the number with the maximum of occurrences in two variables. So you save yourself a lot of space, trading it off for a little bit of time.
There is an algorithm that solves your problem in linear time (linear in the number of items in the input). The idea is to use a hash table to associate to each value in the input a count indicating the number of times that value has been seen. You will have to profile against your expected input and see if this meets your needs.
Please note that this uses O(n) extra space. If this is not acceptable, you might want to consider sorting the input as others have proposed. That solution will be O(n log n) in time and O(1) in space.
Here's an implementation in C++ using std::tr1::unordered_map:
#include <iostream>
#include <unordered_map>
using namespace std;
using namespace std::tr1;
typedef std::tr1::unordered_map<int, int> map;
int main() {
map m;
int a[12] = {1, 1, 5, 1, 3, 7, 2, 1, 8, 9, 1, 2};
for(int i = 0; i < 12; i++) {
int key = a[i];
map::iterator it = m.find(key);
if(it == m.end()) {
m.insert(map::value_type(key, 1));
}
else {
it->second++;
}
}
int count = 0;
int value;
for(map::iterator it = m.begin(); it != m.end(); it++) {
if(it->second > count) {
count = it->second;
value = it->first;
}
}
cout << "Value: " << value << endl;
cout << "Count: " << count << endl;
}
The algorithm works using the input integers as keys in a hashtable to a count of the number of times each integer appears. Thus the key (pun intended) to the algorithm is building this hash table:
int key = a[i];
map::iterator it = m.find(key);
if(it == m.end()) {
m.insert(map::value_type(key, 1));
}
else {
it->second++;
}
So here we are looking at the ith element in our input list. Then what we do is we look to see if we've already seen it. If we haven't, we add a new value to our hash table containing this new integer, and an initial count of one indicating this is our first time seeing it. Otherwise, we increment the counter associated to this value.
Once we have built this table, it's simply a matter of running through the values to find one that appears the most:
int count = 0;
int value;
for(map::iterator it = m.begin(); it != m.end(); it++) {
if(it->second > count) {
count = it->second;
value = it->first;
}
}
Currently there is no logic to handle the case of two distinct values appearing the same number of times and that number of times being the largest amongst all the values. You can handle that yourself depending on your needs.
Here is a simple one, that is O(n log n):
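Sort the vector                         # O(n log n)
Create vars: int MOST = 0, VAL, CURRENT = 0, PREV
for ELEMENT in LIST:
    if ELEMENT is the first element or ELEMENT != PREV:
        CURRENT = 0                     # a new run of equal values starts
    CURRENT += 1
    PREV = ELEMENT
    if CURRENT > MOST:
        MOST = CURRENT
        VAL = ELEMENT
return (VAL, MOST)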
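In C++ the same sort-then-scan idea could be sketched like this. The function name mode_by_sorting is my own, the input is taken by value so the caller's data is left untouched, and an empty input is assumed to yield (0, 0).

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: returns (value, count) of the most frequent element.
std::pair<int, std::size_t> mode_by_sorting(std::vector<int> values)
{
    std::sort(values.begin(), values.end()); // O(n log n)
    int best_value = 0;
    std::size_t best_count = 0;
    std::size_t current = 0; // length of the current run of equal values
    for (std::size_t i = 0; i < values.size(); ++i)
    {
        if (i == 0 || values[i] != values[i - 1])
            current = 0; // a new run starts here
        ++current;
        if (current > best_count)
        {
            best_count = current;
            best_value = values[i];
        }
    }
    return std::make_pair(best_value, best_count);
}
```

If two values are tied for the highest count, this sketch reports the smaller one, because its run is reached first after sorting.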
There are a few methods:
The universal method is "sort it and find the longest run of equal elements", which is O(n log n). Quicksort is usually the fastest sort in practice (O(n log n) on average, O(n^2) in the worst case). You can also use heapsort, which is somewhat slower in the average case but is O(n log n) in the worst case as well.
If you have some information about the numbers, then you can use some tricks. If the numbers are from a limited range, then you can use the counting part of counting sort. It is O(n).
If this isn't your case, there are some other sort algorithms which can do it in linear time, but none of them is universal.
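For the limited-range case, the counting step might be sketched as below. The name mode_in_range and the inclusive bounds min_value and max_value are assumptions for illustration; every input value is assumed to lie within those bounds, and the range is assumed small enough that one counter per possible value fits in memory.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper. Assumes min_value <= v <= max_value for every v.
std::pair<int, std::size_t> mode_in_range(const std::vector<int>& values,
                                          int min_value, int max_value)
{
    // One counter per possible value, as in the first phase of counting sort.
    std::vector<std::size_t> counts(
        static_cast<std::size_t>(max_value - min_value) + 1, 0);
    int best_value = min_value;
    std::size_t best_count = 0;
    for (std::size_t i = 0; i < values.size(); ++i)
    {
        std::size_t& c = counts[values[i] - min_value];
        ++c;
        if (c > best_count) // track the maximum while counting
        {
            best_count = c;
            best_value = values[i];
        }
    }
    return std::make_pair(best_value, best_count);
}
```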
The best time complexity you can get here is O(n). You have to look through all elements, because the last element may be the one which determines the mode.
The solution depends on whether time or space is more important.
If space is more important, then you can sort the list then find the longest sequence of consecutive elements.
If time is more important, you can iterate through the list, keeping a count of the number of occurrences of each element (e.g. hashing element -> count). While doing this, keep track of the element with max count, switching if necessary.
If you also happen to know that the mode is the majority element (i.e. there are more than n/2 elements in the array with this value), then you can get O(n) speed and O(1) space efficiency.
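The answer does not say which algorithm it has in mind for the majority case; one well-known candidate is the Boyer-Moore majority vote, sketched here under the assumption that a majority element really exists. The function name majority_element is my own. If no majority exists, the returned value is meaningless and would need a second counting pass to verify.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the majority element, assuming some value
// occupies more than half of the positions.
int majority_element(const std::vector<int>& values)
{
    int candidate = 0;
    std::size_t balance = 0; // how far the candidate is "ahead" so far
    for (std::size_t i = 0; i < values.size(); ++i)
    {
        if (balance == 0)
        {
            candidate = values[i]; // pick a new candidate
            balance = 1;
        }
        else if (values[i] == candidate)
        {
            ++balance;
        }
        else
        {
            --balance; // a different value cancels one vote
        }
    }
    return candidate;
}
```

Note that the row from the question (1 appears 5 times out of 12) has no majority element, so this shortcut would not apply to it.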
Generic C++ solution:
#include <algorithm>
#include <iterator>
#include <map>
#include <utility>
template<class T, class U>
struct less_second
{
bool operator()(const std::pair<T, U>& x, const std::pair<T, U>& y)
{
return x.second < y.second;
}
};
template<class Iterator>
std::pair<typename std::iterator_traits<Iterator>::value_type, int>
most_frequent(Iterator begin, Iterator end)
{
typedef typename std::iterator_traits<Iterator>::value_type vt;
std::map<vt, int> frequency;
for (; begin != end; ++begin) ++frequency[*begin];
return *std::max_element(frequency.begin(), frequency.end(),
less_second<vt, int>());
}
#include <iostream>
int main()
{
int array[] = {1, 1, 5, 1, 3, 7, 2, 1, 8, 9, 1, 2};
std::pair<int, int> result = most_frequent(array, array + 12);
std::cout << result.first << " appears " << result.second << " times.\n";
}
Haskell solution:
import qualified Data.Map as Map
import Data.List (maximumBy)
import Data.Function (on)
count = foldl step Map.empty where
    step frequency x = Map.alter next x frequency
    next Nothing  = Just 1
    next (Just n) = Just (n+1)
most_frequent = maximumBy (compare `on` snd) . Map.toList . count
example = most_frequent [1, 1, 5, 1, 3, 7, 2, 1, 8, 9, 1, 2]
Shorter Haskell solution, with help from Stack Overflow:
import qualified Data.Map as Map
import Data.List (maximumBy)
import Data.Function (on)
most_frequent = maximumBy (compare `on` snd) . Map.toList .
    Map.fromListWith (+) . flip zip (repeat 1)
example = most_frequent [1, 1, 5, 1, 3, 7, 2, 1, 8, 9, 1, 2]
The solution below gives you the count of each number. It is a better approach than using map in terms of time and space. If you need to get the number that appeared most number of times, then this is not better than previous ones.
EDIT: This approach works only for non-negative single-digit numbers, starting from 1 (that is, the values 1 to 9).
#include <cstring>
#include <iostream>
#include <string>

int main()
{
    std::string row = "1,1,5,1,3,7,2,1,8,9,1,2";
    const int size = 9;                 // one counter per digit 1..9
    int arr[size];
    std::memset(arr, 0, sizeof arr);
    for (std::string::size_type i = 0; i < row.size(); i++)
    {
        if (row[i] >= '1' && row[i] <= '9')   // skip separators and anything else
        {
            int val = row[i] - '0';
            arr[val - 1]++;
        }
    }
    for (int i = 0; i < size; i++)
        std::cout << i + 1 << "-->" << arr[i] << std::endl;
}
Since this is homework I think it's OK to supply a solution in a different language.
In Smalltalk something like the following would be a good starting point:
SequenceableCollection>>mode
| aBag maxCount mode |
aBag := Bag new
addAll: self;
yourself.
aBag valuesAndCountsDo: [ :val :count |
(maxCount isNil or: [ count > maxCount ])
ifTrue: [ mode := val.
maxCount := count ]].
^mode
As time goes by, the language evolves.
We now have many more language constructs that make life simpler:
namespace aliases
CTAD (Class Template Argument Deduction)
more modern containers like std::unordered_map
range based for loops
the std::ranges library
projections
using declarations (type aliases)
structured bindings
more modern algorithms
We could now come up with the following code (it needs C++20 for std::ranges):
#include <iostream>
#include <vector>
#include <unordered_map>
#include <algorithm>
namespace rng = std::ranges;
int main() {
// Demo data
std::vector data{ 2, 456, 34, 3456, 2, 435, 2, 456, 2 };
// Count values
using Counter = std::unordered_map<decltype (data)::value_type, std::size_t> ;
Counter counter{}; for (const auto& d : data) counter[d]++;
// Get max
const auto& [value, count] = *rng::max_element(counter, {}, &Counter::value_type::second);
// Show output
std::cout << '\n' << value << " found " << count << " times\n";
}