Sorting an integer array of 100 elements having only 3 distinct values in it - C++

Suppose I have an array of 100 numbers. The only distinct values in the array are 1, 2 and 3. The values are randomly ordered throughout the array. For instance, the array might be populated as:
int values[100];
for (int i = 0; i < 100; i++)
values[i] = 1 + rand() % 3;
How can I efficiently sort an array like this?

The fastest solution is not to "sort" at all:
Run through the array and count the number of occurrences of 1,2 and 3. These counts should hopefully fit in registers...
Fill the array with the right number of 1s, 2s and 3s, overwriting whatever is there already.
At the end you will have a fully sorted array.
In general, this can be a useful O(n) sorting algorithm when you have a very small range of possible values compared to the size of the array.
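The count-and-refill steps above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function name `sortThreeValues` is hypothetical, and it assumes every element really is 1, 2 or 3.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Sorts a vector whose elements are all 1, 2 or 3 by counting, not comparing.
// Assumption: every element is in {1, 2, 3}; other values are not handled.
void sortThreeValues(std::vector<int>& values) {
    std::array<std::size_t, 3> counts{};   // counts[k] = occurrences of k + 1
    for (int v : values)
        ++counts[v - 1];

    std::size_t pos = 0;
    for (int k = 0; k < 3; ++k)            // write all 1s, then 2s, then 3s
        for (std::size_t c = 0; c < counts[k]; ++c)
            values[pos++] = k + 1;
}
```

Both passes touch each element once, so the running time is linear in the array size.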

Dutch National flag algorithm is the commonly cited algorithm for this and is actually the partition step in one of the variants of quicksort (1 corresponds to less than, 2 to equal to and 3 to greater than). In that variant, you don't need to sort the middle portion.
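A sketch of the Dutch national flag partition, assuming the data lives in a `std::vector<int>`; the name `dutchFlagPartition` and the `mid` parameter are illustrative. For an array holding only 1, 2 and 3, calling it with `mid = 2` leaves the array sorted.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Three-way partition around `mid`, done in one pass and in place.
void dutchFlagPartition(std::vector<int>& a, int mid) {
    std::size_t lo = 0;          // a[0, lo)  holds values <  mid
    std::size_t i = 0;           // a[lo, i)  holds values == mid
    std::size_t hi = a.size();   // a[hi, n)  holds values >  mid
    while (i < hi) {
        if (a[i] < mid)
            std::swap(a[lo++], a[i++]);
        else if (a[i] > mid)
            std::swap(a[i], a[--hi]);   // i stays put: the swapped-in value is unexamined
        else
            ++i;
    }
}
```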

Big O notation for duplicate function, C++

What is the Big O notation for the function described in the screenshot?
It would take O(n) to go through all the numbers, but once it finds the duplicates and removes them, what would that cost? Would the removed part be a constant a? And would the function then have to iterate through the numbers again?
This is what I am thinking for Big O
T(n) = n + a + (n-a) or something involving having to iterate through (n-a) number of steps after the first duplicate is found, then would big O be O(n)?
Big O notation is considering the worst case. Let's say we need to remove all duplicates from the array A=[1..n]. The algorithm will start with the first element and check every remaining element - there are n-1 of them. Since all values happen to be different it won't remove any from the array.
Next, the algorithm selects the second element and checks the remaining n-2 elements in the array. And so on.
When the algorithm arrives at the final element it is done. The total number of comparisons is the sum (n-1) + (n-2) + ... + 2 + 1 + 0. This arithmetic series sums to (n-1)*n/2, whose dominating term is n^2, so the algorithm is O(n^2).
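To check the (n-1)*n/2 figure empirically, here is a small sketch that counts the comparisons such a scan makes. The names `ScanStats` and `scanPairs` are made up for illustration, and nothing is removed from the array, which models the all-unique worst case.

```cpp
#include <cstddef>
#include <vector>

struct ScanStats {
    std::size_t comparisons;   // how many element comparisons were made
    std::size_t equalPairs;    // how many of them found a duplicate
};

// Compares each element with every later element, as the naive scan does.
ScanStats scanPairs(const std::vector<int>& a) {
    ScanStats stats{0, 0};
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t j = i + 1; j < a.size(); ++j) {
            ++stats.comparisons;
            if (a[i] == a[j])
                ++stats.equalPairs;
        }
    return stats;
}
```

For 10 distinct values this reports 45 comparisons, which matches (10-1)*10/2.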
This algorithm is O(n^2), because for each element in the array you iterate over the whole array and count the occurrences of that element.
foreach item in array
    count = 0
    foreach other in array
        if item == other
            count += 1
    if count > 1
        remove item
As you see there are two nested loops in this algorithm which results in O(n*n).
Removed items don't affect the worst case. Consider an array containing only unique elements: no element is removed from it.
Note: A naive implementation of this algorithm could result in O(n^3) complexity.
Starting with the first element, you go through all the remaining elements in the vector, which is n-1 comparisons. Doing that for each of the n elements gives n*(n-1)/2 comparisons in the worst case. The best case is n, when all elements are equal (for example, all elements are 4).

2 player team knowing maximum moves

Given a list of N players who are to play a 2-player game. Each of them is either well versed in making a particular move or not. Find the maximum number of moves a 2-player team can know.
Also find out how many teams can know that maximum number of moves.
Example: suppose we have 4 players and 5 moves, where the ith player is versed in the jth move if a[i][j] is 1 and is not if it is 0.
10101
11100
11010
00101
Here the maximum number of moves a 2-player team can know is 5, and there are two teams that can know that maximum number of moves.
Explanation: (1, 3) and (3, 4) know all 5 moves. So the maximum number of moves a 2-player team knows is 5, and only 2 teams can achieve this.
My approach: for each pair of players I check, for every move, whether either of the players is versed in it, and for each player I maintain his local maximum move count together with the number of pairs that reach it.
vector<int> pairmemo;
vector<int> maxmemo;   // best move count over the pairs (i, j) with j > i
for(int i=0;i<n;i++){
    int mymax=INT_MIN;
    int countpairs=0;
    for(int j=i+1;j<n;j++){
        int count=0;
        for(int k=0;k<m;k++){
            if(arr[i][k]==1 || arr[j][k]==1)
            {
                count++;
            }
        }
        if(mymax<count){
            mymax=count;
            countpairs=0;
        }
        if(mymax==count){
            countpairs++;
        }
    }
    pairmemo.push_back(countpairs);
    maxmemo.push_back(mymax);
}
The overall maximum over all N players is the answer, and the count is the sum of the pair counts of the players that reach it.
int maxi=INT_MIN;
for(int i=0;i<n;i++){
    if(maxi<maxmemo[i])
        maxi=maxmemo[i];
}
int countmaxi=0;
for(int i=0;i<n;i++){
    if(maxmemo[i]==maxi){
        countmaxi+=pairmemo[i];
    }
}
cout<<maxi<<"\n";
cout<<countmaxi<<"\n";
Time complexity : O((N^2)*M)
Code :
How can I improve it?
Constraints : N<= 3000 and M<=1000
If you represent each set of moves by a very large integer, the problem boils down to finding the pairs of players (I, J) that have the maximum number of bits set in MovesI OR MovesJ.
So, you can use bit-packing and compress all the information on moves into an array of long integers. With M <= 1000, it takes 16 unsigned 64-bit integers per player. For each pair of players you OR the corresponding arrays and count the number of ones. This takes O(N^2 * 16), which runs pretty fast given the constraints.
Example:
Let's say the given matrix is
11010
00011
and you used 4-bit integers for packing it.
It would look like:
1101-0000
0001-1000
that is,
13,0
1,8
After OR the moves array for 2 player team becomes 13,8, now count the bits which are one. You have to optimize the counting of bits also, for that read the accepted answer here, otherwise the factor M would appear in complexity. Just maintain one count variable and one maxNumberOfBitsSet variable as you process the pairs.
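As a rough sketch of the packed OR-and-count idea, the following uses `std::bitset`, which packs bits into machine words internally and offers `count()` for the number of set bits. This is a substitution for the hand-rolled 16-word array described above, not the answer's own code; `bestTeams`, `MoveSet` and `kMaxMoves` are illustrative names, and the input is assumed to be one '0'/'1' string per player.

```cpp
#include <bitset>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

constexpr std::size_t kMaxMoves = 1000;        // M <= 1000 from the constraints
using MoveSet = std::bitset<kMaxMoves>;

// Returns {maximum moves known by any pair, number of pairs reaching it}.
std::pair<std::size_t, std::size_t> bestTeams(const std::vector<std::string>& rows) {
    std::vector<MoveSet> players;
    players.reserve(rows.size());
    for (const std::string& r : rows)
        players.emplace_back(r);               // parses the '0'/'1' characters

    std::size_t best = 0, teams = 0;
    for (std::size_t i = 0; i < players.size(); ++i)
        for (std::size_t j = i + 1; j < players.size(); ++j) {
            // One word-wise OR plus a bit count per pair.
            std::size_t known = (players[i] | players[j]).count();
            if (known > best) { best = known; teams = 1; }
            else if (known == best) ++teams;
        }
    return {best, teams};
}
```

On the question's four rows (10101, 11100, 11010, 00101) this yields a maximum of 5 reached by 2 teams.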
What I'd do is:
1. Do a logical OR between all the possible pairs - O(N^2) - and store its sum (the number of set bits) in a 2D array, ignoring the symmetric half below the diagonal (that way we save half of the calculations - see the example)
2. Find the max value in the 2D array (can be done while doing task 1) -> O(1)
3. Count how many cells in the 2D array are equal to the maximum value from task 2 - O(N^2)
Sum: 2*O(N^2) + O(1) => O(N^2)
Example (using the data in the question, with letter indexes):
A[10101] B[11100] C[11010] D[00101]
Task 1:
[A|B] = 11101 = SUM(4)
[A|C] = 11111 = SUM(5)
[A|D] = 10101 = SUM(3)
[B|C] = 11110 = SUM(4)
[B|D] = 11101 = SUM(4)
[C|D] = 11111 = SUM(5)
Task 2 (done while doing task 1):
Max = 5
Task 3:
Count = 2
By the way, O(N^2) is the minimum possible since you HAVE to check all the possible pairs.
Since you have to find all solutions, unless you find a way to find a count without actually finding the solutions themselves, you have to actually look at or eliminate all possible solutions. So the worst case will always be O(N^2*M), which I'll call O(n^3) as long as N and M are both big and similar size.
However, you can hope for much better performance on the average case by pruning.
Don't check every case. Find ways to eliminate combinations without checking them.
I would sum and store the total number of moves known to each player, and sort the array rows by that value. That should provide an easy check for exiting the loop early. Sorting at O(n log n) should be basically free in an O(n^3) algorithm.
Use Priyank's basic idea, except with bitsets, since you obviously can't use a fixed integer type with 3000 bits.
You may benefit from making a second array of bitsets for the columns, and use that as a mask for pruning players.
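One way the sort-and-exit-early idea might look, as a sketch under stated assumptions: a pair can never know more moves than the two players' individual counts added together, so once that bound drops below the best value found, the remaining pairs in sorted order can be skipped. The names `bestTeamsPruned`, `Moves` and `kMoves` are hypothetical, and the column-mask pruning mentioned above is not shown.

```cpp
#include <algorithm>
#include <bitset>
#include <cstddef>
#include <utility>
#include <vector>

constexpr std::size_t kMoves = 1000;
using Moves = std::bitset<kMoves>;

// Returns {maximum moves known by any pair, number of pairs reaching it}.
std::pair<std::size_t, std::size_t> bestTeamsPruned(std::vector<Moves> players) {
    // Strongest players first, so the upper bounds shrink as the loops advance.
    std::sort(players.begin(), players.end(),
              [](const Moves& a, const Moves& b) { return a.count() > b.count(); });
    std::vector<std::size_t> known(players.size());
    for (std::size_t i = 0; i < players.size(); ++i)
        known[i] = players[i].count();

    std::size_t best = 0, teams = 0;
    for (std::size_t i = 0; i + 1 < players.size(); ++i) {
        if (known[i] + known[i + 1] < best) break;   // no later pair can reach best
        for (std::size_t j = i + 1; j < players.size(); ++j) {
            if (known[i] + known[j] < best) break;   // |A or B| <= |A| + |B|
            std::size_t both = (players[i] | players[j]).count();
            if (both > best) { best = both; teams = 1; }
            else if (both == best) ++teams;
        }
    }
    return {best, teams};
}
```

The comparison against `best` is strict so that pairs which could still tie the maximum are counted. The worst case is unchanged; the pruning only helps when the move counts vary enough.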

Find pair of elements in integer array such that abs(v[i]-v[j]) is minimized

Let's say we have an int array with 5 elements: 1, 2, 3, 4, 5
What I need to do is find the minimum absolute value of the differences between the array's elements:
We need to check the following:
1-2 2-3 3-4 4-5
1-3 2-4 3-5
1-4 2-5
1-5
And find the minimum absolute value of these subtractions. We can find it with 2 for loops. The question is, is there any algorithm for finding the value with one and only one for loop?
Sort the list, then take the difference between each pair of adjacent elements; the smallest of those differences is the answer.
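A minimal sketch of this approach using `std::sort`; the function name `minAbsDifference` and the choice to return -1 for fewer than two elements are assumptions made for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

// Returns the smallest |v[i] - v[j]| over all i != j, or -1 if v has fewer
// than two elements. Takes v by value so the caller's order is untouched.
long long minAbsDifference(std::vector<int> v) {
    if (v.size() < 2) return -1;
    std::sort(v.begin(), v.end());
    long long best = std::numeric_limits<long long>::max();
    for (std::size_t i = 0; i + 1 < v.size(); ++i) {
        // Widen before subtracting so INT_MAX - INT_MIN cannot overflow.
        long long diff = static_cast<long long>(v[i + 1]) - v[i];
        best = std::min(best, diff);
    }
    return best;
}
```

After sorting, the closest pair must be adjacent, so a single loop over neighbours is enough; the overall cost is dominated by the O(n log n) sort.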
The provably best-performing solution is asymptotically linear, O(n), up to constant factors.
This means that the time taken is proportional to the number of the elements in the array (which of course is the best we can do as we at least have to read every element of the array, which already takes O(n) time).
Here is one such O(n) solution (which also uses O(1) space if the list can be modified in-place):
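int mindiff(vector<int>& v)   // non-const: the vector is sorted in place
{
    IntRadixSort(v.begin(), v.end());
    int best = INT_MAX;       // from <climits>
    // i + 1 < v.size() avoids unsigned wrap-around when v is empty
    for (size_t i = 0; i + 1 < v.size(); i++)
    {
        int diff = abs(v[i]-v[i+1]);
        if (diff < best)
            best = diff;
    }
    return best;
}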
IntRadixSort is a linear time fixed-width integer sorting algorithm defined here:
http://en.wikipedia.org/wiki/Radix_sort
The concept is that you leverage the fixed bit width of ints by partitioning them in a fixed series of passes over the bit positions, i.e. partition them on the high bit (32nd), then on the next highest (31st), then on the next (30th), and so on, which takes only linear time.
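The bit-by-bit partitioning described above might be sketched like this. It is an illustration only, restricted to unsigned 32-bit values (signed ints would need the sign bit handled separately), and `binaryRadixSort` is a made-up name rather than the `IntRadixSort` referenced in the answer.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

using Iter = std::vector<std::uint32_t>::iterator;

// Most-significant-bit-first binary radix sort, in place.
// Each bit level partitions every element at most once, so the total work is
// at most 32 passes over the data, i.e. linear in the number of elements.
void binaryRadixSort(Iter first, Iter last, int bit = 31) {
    if (last - first < 2 || bit < 0) return;
    const std::uint32_t mask = std::uint32_t{1} << bit;
    // Values with the current bit clear go before values with it set.
    Iter mid = std::partition(first, last,
                              [mask](std::uint32_t x) { return (x & mask) == 0; });
    binaryRadixSort(first, mid, bit - 1);
    binaryRadixSort(mid, last, bit - 1);
}
```

The recursion depth is bounded by the bit width (32), not by the input size.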
The problem is equivalent to sorting. Any sorting algorithm could be used, and at the end, return the difference between the nearest elements. A final pass over the data could be used to find that difference, or it could be maintained during the sort. Before the data is sorted the min difference between adjacent elements will be an upper bound.
So to do it without two loops, use a sorting algorithm that does not have two loops. In a way it feels like semantics, but recursive sorting algorithms will do it with only one loop. If the issue is the n(n+1)/2 subtractions required by the simple two-loop case, you can use an O(n log n) algorithm.
No. Unless you know the list is sorted, you need two loops.
It's simple. Iterate in one for loop,
keeping four variables: "minpos" and "maxpos", and "minneg" and "maxneg".
Check the sign of each value you encounter, and store the maximum positive number in maxpos
and the minimum positive number in "minpos"; do the same in a separate if case for numbers
less than zero. Now store the difference maxpos - minpos in one variable and
maxneg - minneg in another, and print the larger of the two. You will get
the desired result.
I believe you already know how to find the max and min in one for loop.
Correction: the above finds the maximum difference. For the minimum you need to
take the max and the second max instead of the max and min :)
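This might help you:
int n=5;                    // number of elements in a[]
int subtractmin=INT_MAX;    // smallest |a[m]-a[i]| seen so far
int m=0;                    // index of the first element of the pair
for(int i=1;m<n-1;i++){
    if(abs(a[m]-a[i])<subtractmin)
        subtractmin=abs(a[m]-a[i]);
    if(i==n-1){             // all pairs starting at m are done
        m=m+1;
        i=m;                // the loop increment makes this m+1
    }
}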

Generate a new element different from 1000 elements of an array

I was asked this question in an interview. Consider the scenario of punched cards, where each punched card has a 64-bit pattern. It was suggested that I treat each card as an int, since each int is a collection of bits.
Also, consider that I have an array which already contains 1000 such cards. I have to generate a new element every time which is different from the previous 1000 cards. The integers (aka cards) in the array are not necessarily sorted.
Furthermore, how would that be possible? The question was for C++; where does the 64-bit int come from, and how can I generate this new card from the array so that the generated element is different from all the elements already present in the array?
There are 2^64 64 bit integers, a number that is so much
larger than 1000 that the simplest solution would be to just generate a
random 64 bit number, and then verify that it isn't in the table of
already generated numbers. (The probability that it is is
infinitesimal, but you might as well be sure.)
Since most random number generators do not generate 64 bit values, you
are left with either writing your own, or (much simpler), combining the
values, say by generating 8 random bytes, and memcpying them into a
uint64_t.
As for verifying that the number isn't already present, std::find is
just fine for one or two new numbers; if you have to do a lot of
lookups, sorting the table and using a binary search would be
worthwhile. Or some sort of a hash table.
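A sketch of the generate-and-verify approach, assuming C++11 `<random>` is available: `std::mt19937_64` produces a full 64-bit value per call, so no byte-combining is needed here. The function name `newRandomCard` is illustrative.

```cpp
#include <algorithm>
#include <cstdint>
#include <random>
#include <vector>

// Draws random 64-bit values until one is not already in `cards`.
// With 1000 cards out of 2^64 possibilities the loop almost always runs once.
std::uint64_t newRandomCard(const std::vector<std::uint64_t>& cards,
                            std::mt19937_64& rng) {
    std::uint64_t candidate;
    do {
        candidate = rng();
    } while (std::find(cards.begin(), cards.end(), candidate) != cards.end());
    return candidate;
}
```

The linear `std::find` is fine for a handful of new cards; for many lookups, a sorted table with binary search or a hash set would replace it.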
I may be missing something, but most of the other answers appear to me as overly complicated.
Just sort the original array and then start counting from zero: if the current count is in the array skip it, otherwise you have your next number. This algorithm is O(n), where n is the number of newly generated numbers: because the array has a fixed size, both sorting it and skipping existing numbers take constant time. Here's an example:
#include <algorithm>
#include <iostream>
unsigned array[] = { 98, 1, 24, 66, 20, 70, 6, 33, 5, 41 };
const unsigned size = sizeof array / sizeof array[0];
unsigned count = 0;
unsigned index = 0;
int main() {
    std::sort(array, array + size);
    while ( count < 100 ) {
        // index < size guards against reading past the end of the array
        if ( index < size && count > array[index] )
            ++index;
        else {
            if ( index == size || count < array[index] )
                std::cout << count << std::endl;
            ++count;
        }
    }
}
Here's an O(n) algorithm:
int64 generateNewValue(list_of_cards)
{
return find_max(list_of_cards)+1;
}
Note: As @amit points out below, this will fail if INT64_MAX is already in the list.
As far as I'm aware, this is the only way you're going to get O(n). If you want to deal with that (fairly important) edge case, then you're going to have to do some kind of proper sort or search, which will take you to O(n log n).
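A minimal sketch of the max-plus-one idea with the INT64_MAX edge case made explicit. The name `maxPlusOne` and the bool-plus-out-parameter interface are assumptions for illustration; the pseudocode above simply returns the value.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>
#include <vector>

// Writes max(cards) + 1 to `out` and returns true, or returns false when the
// maximum is already INT64_MAX and adding 1 would overflow.
// An empty list yields 0.
bool maxPlusOne(const std::vector<std::int64_t>& cards, std::int64_t& out) {
    if (cards.empty()) { out = 0; return true; }
    std::int64_t largest = *std::max_element(cards.begin(), cards.end());
    if (largest == std::numeric_limits<std::int64_t>::max()) return false;
    out = largest + 1;
    return true;
}
```

When the function returns false, the caller has to fall back to one of the search-based approaches.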
@arne is almost there. What you need is a self-balancing interval tree, which can be built in O(n lg n) time.
Then take the top node, which will store some interval [i, j]. By the properties of an interval tree, both i-1 and j+1 are valid candidates for a new key, unless i = UINT64_MIN or j = UINT64_MAX. If both are true, then you've stored 2^64 elements and you can't possibly generate a new element. Store the new element, which takes O(lg n) worst-case time.
I.e.: init takes O(n lg n), generate takes O(lg n). Both are worst-case figures. The greatest thing about this approach is that the top node will keep "growing" (storing larger intervals) and merging with its successor or predecessor, so the tree will actually shrink in terms of memory use and eventually the time per operation decays to O(1). You also won't waste any numbers, so you can keep generating until you've got 2^64 of them.
This algorithm has O(N lg N) initialisation, O(1) query and O(N) memory usage. I assume you have some integer type which I will refer to as int64 and that it can represent the integers [0, int64_max].
Sort the numbers
Create a linked list containing intervals [u, v]
Insert [0, first number - 1]
For each of the remaining numbers, insert [prev number + 1, current number - 1]
Insert [last number + 1, int64_max]
You now have a list representing the numbers which are not used. You can simply iterate over them to generate new numbers.
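The interval-building steps above could be sketched as follows. This version is an approximation of the description: it stores the intervals in a `std::vector` rather than a linked list, uses unsigned 64-bit values so the range starts at 0, and skips empty intervals; `freeIntervals` and `Interval` are hypothetical names.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>
#include <utility>
#include <vector>

using Interval = std::pair<std::uint64_t, std::uint64_t>;   // inclusive [first, second]

// Returns the ranges of values that do not occur in `used`, in ascending order.
std::vector<Interval> freeIntervals(std::vector<std::uint64_t> used) {
    const std::uint64_t kMax = std::numeric_limits<std::uint64_t>::max();
    std::sort(used.begin(), used.end());
    std::vector<Interval> gaps;
    std::uint64_t next = 0;      // smallest value not yet classified
    bool exhausted = false;      // true once `used` contains kMax
    for (std::uint64_t u : used) {
        if (u > next) gaps.emplace_back(next, u - 1);
        if (u == kMax) { exhausted = true; break; }
        next = u + 1;
    }
    if (!exhausted) gaps.emplace_back(next, kMax);
    return gaps;
}
```

For the used values 5, 1, 2, 9 the gaps come out as [0, 0], [3, 4], [6, 8] and [10, 2^64 - 1]; handing out new numbers is then a walk through these intervals.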
I think the way to go is to use some kind of hashing. So you store your cards in buckets based on, let's say, a MOD operation. Until you create some sort of index, you are stuck with looping over the whole array.
If you have a look at the HashSet implementation in Java, you might get a clue.
Edit: I assume you wanted them to be random numbers; if you don't mind a sequence, MAX+1 below is a good solution :)
You could build a binary tree of the already existing elements and traverse it until you find a node whose depth is not 64 and which has fewer than two child nodes. You can then construct a "missing" child node and have a new element. This should be fairly quick, on the order of O(n) if I'm not mistaken.
bool seen[1001] = { false };
for each element of the original array
    if the element is in the range 0..1000
        seen[element] = true
find the index for the first false value in seen
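The pseudocode above translates to C++ roughly as follows. This sketch generalizes the fixed 1001-slot table to `cards.size() + 1` slots, and the name `firstMissing` is invented for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the smallest value in 0..cards.size() that is absent from `cards`.
// With n cards there are n + 1 candidate values, so at least one must be free.
std::uint64_t firstMissing(const std::vector<std::uint64_t>& cards) {
    std::vector<bool> seen(cards.size() + 1, false);
    for (std::uint64_t c : cards)
        if (c < seen.size())                    // values outside the table are ignored
            seen[static_cast<std::size_t>(c)] = true;
    std::size_t i = 0;
    while (seen[i]) ++i;   // terminates: at most cards.size() slots can be set
    return i;
}
```

Both the marking pass and the scan are linear in the number of cards.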
Initialization:
Don't sort the list.
Create a new array 1000 long containing 0..999.
Iterate the list and, if any number is in the range 0..999, invalidate it in the new array by replacing the value in the new array with the value of the first item in the list.
Insertion:
Use an incrementing index to the new array. If the value in the new array at this index is not the value of the first element in the list, add it to the list, else check the value from the next position in the new array.
When the new array is used up, refill it using 1000..1999 and invalidating existing values as above. Yes, this is looping over the list, but it doesn't have to be done for each insertion.
Near O(1) until the list gets so large that occasionally iterating it for invalidation of the 'new' new array becomes significant. Maybe you could mitigate this by using a new array that grows, maybe always the size of the list?
Rgds,
Martin
Put them all into a hash table of size > 1000, and find the empty cell (this is the parking problem). Generate a key for that. This will of course work better for bigger table size. The table needs only 1-bit entries.
EDIT: this is the pigeonhole principle.
This needs "modulo tablesize" (or some other "semi-invertible" function) for a hash function.
unsigned hashtab[1001] = {0,};
unsigned long long numbers[1000] = { ... };
void init (void)
{
    unsigned idx;
    for (idx=0; idx < 1000; idx++) {
        hashtab [ numbers[idx] % 1001 ] += 1; }
}
unsigned long long generate(void)
{
    unsigned idx;
    for (idx = 0; idx < 1001; idx++) {
        if ( !hashtab [ idx] ) break; }
    /* widen before multiplying so rand() * 1001 cannot overflow int */
    return idx + (unsigned long long) rand() * 1001;
}
Based on the solution here: question on array and number
Since there are 1000 numbers, if we consider their remainders with 1001, at least one remainder will be missing. We can pick that as our missing number.
So we maintain an array of counts: C[1001], which will maintain the number of integers with remainder r (upon dividing by 1001) in C[r].
We also maintain a set of numbers for which C[j] is 0 (say using a linked list).
When we move the window over, we decrement the count of the first element (say remainder i), i.e. decrement C[i]. If C[i] becomes zero we add i to the set of numbers. We update the C array with the new number we add.
If we need one number, we just pick a random element from the set of j for which C[j] is 0.
This is O(1) for new numbers and O(n) initially.
This is similar to other solutions but not quite.
How about something simple like this:
1) Partition the array into numbers less than or equal to 1000 and numbers above 1000
2) If all the numbers fit within the lower partition then choose 1001 (or any number greater than 1000) and we're done.
3) Otherwise we know that there must exist a number between 1 and 1000 that doesn't exist within the lower partition.
4) Create a 1000 element array of bools, or a 1000-element long bitfield, or whatnot and initialize the array to all 0's
5) For each integer in the lower partition, use its value as an index into the array/bitfield and set the corresponding bool to true (ie: do a radix sort)
6) Go over the array/bitfield and pick any unset value's index as the solution
This works in O(n) time, or since we've bounded everything by 1000, technically it's O(1), but O(n) time and space in general. There are three passes over the data, which isn't necessarily the most elegant approach, but the complexity remains O(n).
You can create a new array with the numbers that are not in the original array, then just pick one from this new array.
¿O(1)?

Fast way to pick randomly from a set, with each entry picked only once?

I'm working on a program to solve the n queens problem (the problem of putting n chess queens on an n x n chessboard such that none of them is able to capture any other using the standard chess queen's moves). I am using a heuristic algorithm, and it starts by placing one queen in each row and picking a column randomly out of the columns that are not already occupied. I feel that this step is an opportunity for optimization. Here is the code (in C++):
vector<int> colsleft;
//fills the vector sequentially with integer values
for (int c=0; c < size; c++)
    colsleft.push_back(c);
for (int i=0; i < size; i++)
{
    vector<int>::iterator randplace = colsleft.begin() + rand()%colsleft.size();
    /* chboard is an integer array, with each entry representing a row
       and holding the column position of the queen in that row */
    chboard[i] = *randplace;
    colsleft.erase(randplace);
}
If it is not clear from the code: I start by building a vector containing an integer for each column. Then, for each row, I pick a random entry in the vector, assign its value to that row's entry in chboard[]. I then remove that entry from the vector so it is not available for any other queens.
I'm curious about methods that could use arrays and pointers instead of a vector. Or <list>s? Is there a better way of filling the vector sequentially, other than the for loop? I would love to hear some suggestions!
The following should fulfill your needs:
#include <algorithm>
...
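int randplace[size];   // size must be a compile-time constant in standard C++
for (int i = 0; i < size; i ++)
    randplace[i] = i;
// note: std::random_shuffle was removed in C++17; std::shuffle replaces it
random_shuffle(randplace, randplace + size);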
You can do the same stuff with vectors, too, if you wish.
Source: http://gethelp.devx.com/techtips/cpp_pro/10min/10min1299.asp
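For compilers where `std::random_shuffle` is no longer available (it was removed in C++17), the same idea can be written with `std::iota` and `std::shuffle`. This is a sketch with a made-up function name, `randomColumns`, and it assumes `size` is non-negative.

```cpp
#include <algorithm>
#include <numeric>
#include <random>
#include <vector>

// Returns a random permutation of 0..size-1; entry i can serve as the column
// of the queen in row i.
std::vector<int> randomColumns(int size, std::mt19937& rng) {
    std::vector<int> cols(size);
    std::iota(cols.begin(), cols.end(), 0);        // fill with 0, 1, ..., size-1
    std::shuffle(cols.begin(), cols.end(), rng);   // uniform shuffle in O(size)
    return cols;
}
```

`std::iota` also covers the question about filling the vector sequentially without a hand-written loop.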
Couple of random answers to some of your questions :):
As far as I know, there's no way to fill an array with consecutive values without iterating over it first. HOWEVER, if you really just need consecutive values, you do not need to fill the array - just use the cell indices as the values: a[0] is 0 and a[100] is 100 - when you get a random number, treat the number as the value.
You can implement the same with a list<> and remove cells you already hit, or...
For better performance, rather than removing cells, why not put an "already used" value in them (like -1) and check for that. Say you get a random number like 73, and a[73] contains -1, you just get a new random number.
Finally, describing item 3 reminded me of a re-hashing function. Perhaps you can implement your algorithm as a hash-table?
Your colsleft.erase(randplace); line is really inefficient, because erasing an element in the middle of the vector requires shifting all the ones after it. A more efficient approach that will satisfy your needs in this case is to simply swap the element with the one at index (size - i - 1) (the element whose index will be outside the range in the next iteration, so we "bring" that element into the middle, and swap the used one out).
And then we don't even need to bother deleting that element -- the end of the array will accumulate the "chosen" elements. And now we've basically implemented an in-place Knuth shuffle.