Generating an array of integers without violating constraints - c++

I have been struggling with a problem for hours. It is a constraint satisfaction problem. Let me describe it with a simple example:
Assume there is an array of integers of length 8. Every cell can take certain values: the first 4 cells can take 0, 1 or 2, and the other half can take 0 or 1. These 3 arrays are some examples:
{2,1,0,2,1,1,0,1}
{2,2,1,0,0,1,0,0}
{0,0,0,2,0,0,0,1}
However, there are some constraints on how the arrays may be constructed:
constraint1 = {1,-,-,-,-,1,-,-} // !(cell1=1 && cell6=1): cell1 and cell6 cannot take these values at the same time
constraint2 = {0,-,-,-,-,-,-,0} // !(cell1=0 && cell8=0)
constraint3 = {-,-,-,2,1,1,-,-} // !(cell4=2 && cell5=1 && cell6=1)
constraint4 = {1,1,-,-,-,-,-,-} // !(cell1=1 && cell2=1)
For better understanding:
{0,1,1,2,0,1,0,0} // not valid, because it violates constraint2
{1,1,2,2,1,1,0,1} // not valid, because it violates constraint1, constraint3 and constraint4
{1,1,0,0,0,1,0,0} // not valid, because it violates constraint1 and constraint4
I need to generate an array of integers which does not violate any of the given constraints.
My approach:
1) Create an array (called myArray) and initialize every cell to -1
2) Count how many times each cell is used in the constraints. In the example above, cell1 is used 3 times, cell2 is used 1 time, cell3 is not used, and so on
3) Choose the cell which is used most in the constraints (here cell1, used 3 times)
4) Find the distribution of values in this cell (in cell1, 1 is used 2 times and 0 is used 1 time)
5) Set this chosen cell in myArray to the value which is used least (since 0 is used less than 1 in cell1, cell1 in myArray becomes 0)
6) Delete from the list all the constraints which have 1 or 2 in their cell1; they can no longer be violated
7) Go to step 2 and repeat until all constraints are eliminated
The idea of this algorithm is to choose each cell and its value so as to eliminate as many constraints as possible; a minimal sketch of this heuristic is below.
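A minimal sketch of the heuristic described above (the data layout and function name are my own; -1 marks a "don't care" cell in a constraint, mirroring the dashes above):

#include <algorithm>
#include <map>
#include <vector>

using Constraint = std::vector<int>; // one entry per cell, -1 = "don't care"

// Cells left at -1 in the result are unconstrained and may take any allowed value.
std::vector<int> greedyAssign(int length, std::vector<Constraint> constraints)
{
    std::vector<int> result(length, -1);
    while (!constraints.empty()) {
        // Steps 2-3: find the cell that appears in the most constraints.
        std::vector<int> uses(length, 0);
        for (const Constraint& c : constraints)
            for (int i = 0; i < length; ++i)
                if (c[i] != -1) ++uses[i];
        int cell = (int)(std::max_element(uses.begin(), uses.end()) - uses.begin());

        // Steps 4-5: among the values the constraints demand for that cell,
        // pick the one demanded least often.
        std::map<int, int> freq;
        for (const Constraint& c : constraints)
            if (c[cell] != -1) ++freq[c[cell]];
        if (freq.empty())
            break; // degenerate: remaining constraints mention no cells
        int value = std::min_element(freq.begin(), freq.end(),
            [](const auto& a, const auto& b) { return a.second < b.second; })->first;
        result[cell] = value;

        // Step 6: drop every constraint that demands a different value for
        // this cell; those can no longer be violated.
        std::size_t before = constraints.size();
        constraints.erase(std::remove_if(constraints.begin(), constraints.end(),
            [&](const Constraint& c) { return c[cell] != -1 && c[cell] != value; }),
            constraints.end());
        if (constraints.size() == before)
            break; // stuck: nothing was eliminated, which is exactly the
                   // failure mode described next
    }
    return result;
}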
However, this algorithm stops working when the number of constraints is higher.
Important note: this is just a simple example. In the real case, the length of the array is longer (around 100 on average) and the number of constraints is higher (more than 200). My input is the length of the array, the N constraints, and the values each cell can take.
Does anyone have a better idea for solving this problem?

Here is some code that I have written in C# to generate a random matrix and then remove the constraints from it.
class Program
{
    static void Main(string[] args)
    {
        int[] inputData = new int[4] { 3, 7, 3, 3 }; // number of values each cell can take
        int matrixRowSize = 6;

        // Constraints
        int[] constraint1 = new int[4] { 1, -1, -1, 2 }; // the constraint I want to remove; -1 means "don't care"
        // note: there could be more than one constraint, so a generic method is needed

        Random r = new Random();
        int[,] Random_matrix = new int[matrixRowSize, inputData.Length];

        // generate random matrix
        for (int i = 0; i < inputData.Length; i++)
        {
            for (int j = 0; j < matrixRowSize; j++)
            {
                int k = r.Next(0, inputData[i]);
                Random_matrix[j, i] = k;
            }
        }
    }
}

Related

Checking occurrence of values within two vectors simultaneously

My goal is to find how many times each complex number occurs in the two vectors.
wr_den : vector with the real parts
wi_den : vector with the imaginary parts
ordem_den[0] : the vectors' number of elements (in this case 3)
Example:
wr_den[0] = 1 wi_den[0] = 1
wr_den[1] = 1 wi_den[1] = 1
wr_den[2] = 1 wi_den[2] = 0
Result:
index 0: 2
index 1: 2
index 2: 1
My code:

for (it = 0; it < ordem_den[0]; it++)
{
    times = 0;
    for (contador = 0; contador < ordem_den[0]; contador++)
    {
        p = wr_den[it];
        x = wr_den[contador];
        y = wi_den[it];
        t = wi_den[contador];
        if ((p == x) && (t == y))
        {
            times++;
        }
    }
}
A std::multiset would probably be the thing here, assuming that the perceived problem is the O(n^2) time of the current code.
First iterate through the index positions, putting those numbers in the set. Then iterate again, checking the number of occurrences of each number at each index position.
You don't specify the number types used. Be aware that non-integer floating point values may not compare as equal even though they look equal on output. That may or may not be a problem.
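A minimal sketch of that idea (container choice and names are mine): treat each (real, imaginary) pair as one value, fill a std::multiset in a first pass, then query counts in a second pass, for O(n log n) overall instead of O(n^2):

#include <iostream>
#include <set>
#include <utility>
#include <vector>

int main()
{
    std::vector<double> wr_den = {1, 1, 1}; // real parts
    std::vector<double> wi_den = {1, 1, 0}; // imaginary parts

    std::multiset<std::pair<double, double>> values;
    for (std::size_t i = 0; i < wr_den.size(); ++i)
        values.insert({wr_den[i], wi_den[i]}); // first pass: fill the set

    for (std::size_t i = 0; i < wr_den.size(); ++i) // second pass: count
        std::cout << "index " << i << ": "
                  << values.count({wr_den[i], wi_den[i]}) << "\n";
}

This prints 2, 2, 1 for the example above, matching the expected result.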

n-th or Arbitrary Combination of a Large Set

Say I have a set of numbers from [0, ....., 499]. Combinations are currently being generated sequentially using the C++ std::next_permutation. For reference, the size of each tuple I am pulling out is 3, so I am returning sequential results such as [0,1,2], [0,1,3], [0,1,4], ... [497,498,499].
Now, I want to parallelize the code that this is sitting in, so a sequential generation of these combinations will no longer work. Are there any existing algorithms for computing the ith combination of 3 from 500 numbers?
I want to make sure that each thread, regardless of the iterations of the loop it gets, can compute a standalone combination based on the i it is iterating with. So if I want the combination for i=38 in thread 1, I can compute [1,2,5] while simultaneously computing i=0 in thread 2 as [0,1,2].
EDIT: The statement below is irrelevant, I mixed myself up.
I've looked at algorithms that utilize factorials to narrow down each individual element from left to right, but I can't use these as 500! sure won't fit into memory. Any suggestions?
Here is my shot:
int k = 527; // the kth combination is calculated
int N = 500; // number of elements you have
int a = 0, b = 1, c = 2; // a, b, c are the numbers you get out

while (k >= (N - a - 1) * (N - a - 2) / 2) {
    k -= (N - a - 1) * (N - a - 2) / 2;
    a++;
}
b = a + 1;
while (k >= N - 1 - b) {
    k -= N - 1 - b;
    b++;
}
c = b + 1 + k;

cout << "[" << a << "," << b << "," << c << "]" << endl; // the result
I got this by thinking about how many combinations there are until the next number is increased. However, it only works for three elements, and I can't guarantee that it is correct. It would be great if you compared it to your results and gave some feedback; a small brute-force check is sketched below.
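For what it's worth, here is a small check harness (my own, not part of the answer) that wraps the snippet above in a function and compares it against brute-force lexicographic enumeration for a small N:

#include <iostream>
#include <tuple>

// The snippet above as a function; k is the 0-based rank.
std::tuple<int, int, int> unrank(int k, int N)
{
    int a = 0;
    while (k >= (N - a - 1) * (N - a - 2) / 2) {
        k -= (N - a - 1) * (N - a - 2) / 2;
        ++a;
    }
    int b = a + 1;
    while (k >= N - 1 - b) {
        k -= N - 1 - b;
        ++b;
    }
    return std::make_tuple(a, b, b + 1 + k);
}

int main()
{
    const int N = 10; // small N so brute force is instant
    int k = 0;
    for (int a = 0; a < N; ++a)
        for (int b = a + 1; b < N; ++b)
            for (int c = b + 1; c < N; ++c, ++k)
                if (unrank(k, N) != std::make_tuple(a, b, c))
                    std::cout << "mismatch at k=" << k << "\n";
    std::cout << "checked " << k << " combinations\n"; // C(10,3) = 120
}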
If you are looking for a way to obtain the lexicographic index or rank of a unique combination instead of a permutation, then your problem falls under the binomial coefficient. The binomial coefficient handles problems of choosing unique combinations in groups of K with a total of N items.
I have written a class in C# to handle common functions for working with the binomial coefficient. It performs the following tasks:
Outputs all the K-indexes in a nice format for any N choose K to a file. The K-indexes can be substituted with more descriptive strings or letters.
Converts the K-indexes to the proper lexicographic index or rank of an entry in the sorted binomial coefficient table. This technique is much faster than older published techniques that rely on iteration. It does this by using a mathematical property inherent in Pascal's Triangle and is very efficient compared to iterating over the set.
Converts the index in a sorted binomial coefficient table to the corresponding K-indexes. I believe it is also faster than older iterative solutions.
Uses Mark Dominus method to calculate the binomial coefficient, which is much less likely to overflow and works with larger numbers.
The class is written in .NET C# and provides a way to manage the objects related to the problem (if any) by using a generic list. The constructor of this class takes a bool value called InitTable that when true will create a generic list to hold the objects to be managed. If this value is false, then it will not create the table. The table does not need to be created in order to use the 4 above methods. Accessor methods are provided to access the table.
There is an associated test class which shows how to use the class and its methods. It has been extensively tested with 2 cases and there are no known bugs.
To read about this class and download the code, see Tablizing The Binomial Coefficient.
The following tested code will iterate through each unique combination:

public void Test10Choose5()
{
    String S;
    int Loop;
    int N = 500; // Total number of elements in the set.
    int K = 3;   // Total number of elements in each group.
    // Create the bin coeff object required to get all
    // the combos for this N choose K combination.
    BinCoeff<int> BC = new BinCoeff<int>(N, K, false);
    int NumCombos = BinCoeff<int>.GetBinCoeff(N, K);
    // The KIndexes array specifies the indexes for a lexicographic element.
    int[] KIndexes = new int[K];
    StringBuilder SB = new StringBuilder();
    // Loop through all the combinations for this N choose K case.
    for (int Combo = 0; Combo < NumCombos; Combo++)
    {
        // Get the K-indexes for this combination.
        BC.GetKIndexes(Combo, KIndexes);
        // Verify that the KIndexes returned can be used to retrieve the
        // rank or lexicographic order of the KIndexes in the table.
        int Val = BC.GetIndex(true, KIndexes);
        if (Val != Combo)
        {
            S = "Val of " + Val.ToString() + " != Combo Value of " + Combo.ToString();
            Console.WriteLine(S);
        }
        SB.Remove(0, SB.Length);
        for (Loop = 0; Loop < K; Loop++)
        {
            SB.Append(KIndexes[Loop].ToString());
            if (Loop < K - 1)
                SB.Append(" ");
        }
        S = "KIndexes = " + SB.ToString();
        Console.WriteLine(S);
    }
}
You should be able to port this class over fairly easily to C++. You probably will not have to port over the generic part of the class to accomplish your goals. Your test case of 500 choose 3 yields 20,708,500 unique combinations, which will fit in a 4-byte int. If 500 choose 3 is simply an example case and you need combinations of more than 3 elements, then you will have to use longs or perhaps fixed-point integers.
You can describe a particular selection of 3 out of 500 objects as a triple (i, j, k), where i is a number from 0 to 499 (the index of the first number), j ranges from 0 to 498 (the index of the second, skipping over whichever number was first), and k ranges from 0 to 497 (the index of the last, skipping both previously-selected numbers). Given that, it's actually pretty easy to enumerate all the possible selections: starting with (0,0,0), increment k until it reaches its maximum value, then increment j and reset k to 0, and so on; when j reaches its own maximum value, increment i and reset both j and k, and continue.
If this description sounds familiar, it's because it's exactly the same way that incrementing a base-10 number works, except that the base is much funkier, and in fact the base varies from digit to digit. You can use this insight to implement a very compact version of the idea: for any integer n from 0 to 500*499*498 - 1, you can get:
#include <algorithm>
#include <iostream>

struct triple {
    int i, j, k;
};

triple AsTriple(int n) {
    triple result;
    result.k = n % 498;
    n = n / 498;
    result.j = n % 499;
    n = n / 499;
    result.i = n % 500; // unnecessary, any legal n will already be between 0 and 499
    return result;
}

void PrintSelections(triple t) {
    int i = t.i;
    int j = t.j + (t.j >= i ? 1 : 0);             // skip the index taken by i
    int lo = std::min(i, j), hi = std::max(i, j);
    int k = t.k;                                  // skip both taken indices,
    if (k >= lo) ++k;                             // in increasing order
    if (k >= hi) ++k;
    std::cout << "[" << i << "," << j << "," << k << "]" << std::endl;
}

void PrintRange(int start, int end) {
    for (int n = start; n < end; ++n) {
        PrintSelections(AsTriple(n));
    }
}
Now to shard, you can just take the numbers from 0 to 500*499*498 - 1, divide them into subranges in any way you'd like, and have each shard compute the selection for each value in its subrange; a short sketch follows.
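For example (my own sketch, reusing PrintRange from above), an even four-way split could look like this:

// 500 * 499 * 498 = 124,251,000 total selection numbers
const long long total = 500LL * 499 * 498;
const int shards = 4;
for (int s = 0; s < shards; ++s) {
    int begin = (int)(total * s / shards);
    int end = (int)(total * (s + 1) / shards);
    PrintRange(begin, end); // in real code, hand this range to worker thread s
}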
This trick is very handy for any problem in which you need to enumerate subsets.

How to produce a random number sequence that doesn't produce more than X consecutive elements

Ok, I really don't know how to frame the question properly, because I barely have any idea how to describe what I want in one sentence, and I apologize.
Let me get straight to the point, and you can just skip the rest, because I just want to show that I've tried something and am not coming here to ask a question on a whim.
I need an algorithm that produces 6 random numbers, where no value may appear more than 2 times in a row.
example: 3 3 4 4 2 1
^FINE.
example: 3 3 3 4 4 2
^NO! NO! WRONG!
Obviously, I have no idea how to do this without tripping over myself constantly.
Is there an STL or Boost feature that can do this? Or maybe someone here knows how to concoct an algorithm for it. That would be awesome.
What I'm trying to do and what I've tried (the part you can skip)
This is in C++. I'm trying to make a Panel de Pon/Tetris Attack/Puzzle League whatever clone for practice. The game has rows of 6 blocks, and 3 or more matching blocks will destroy the blocks. Here's a video in case you're not familiar.
When a new row comes from the bottom, it must not come out with 3 horizontal matching blocks, or else they will automatically disappear; something I do not want horizontally. Vertically is fine, though.
I've tried to accomplish just that, and it appears I can't get it right. When I start the game, chunks of blocks are missing because it detects a match when it shouldn't. My method is more than likely heavy-handed and too convoluted, as you'll see.
enum BlockType {EMPTY, STAR, UP_TRIANGLE, DOWN_TRIANGLE, CIRCLE, HEART, DIAMOND};

vector<Block> BlockField::ConstructRow()
{
    vector<Block> row;
    int type = (rand() % 6) + 1;
    for (int i = 0; i < 6; i++)
    {
        row.push_back(Block(type));
        type = (rand() % 6) + 1;
    }
    // must be in order from last to first of the enumeration
    RowCheck(row, diamond_match);
    RowCheck(row, heart_match);
    RowCheck(row, circle_match);
    RowCheck(row, downtriangle_match);
    RowCheck(row, uptriangle_match);
    RowCheck(row, star_match);
    return row;
}

void BlockField::RowCheck(vector<Block> &row, Block blockCheckArray[3])
{
    vector<Block>::iterator block1 = row.begin();
    vector<Block>::iterator block2 = row.begin() + 1;
    vector<Block>::iterator block3 = row.begin() + 2;
    vector<Block>::iterator block4 = row.begin() + 3;
    vector<Block>::iterator block5 = row.begin() + 4;
    vector<Block>::iterator block6 = row.begin() + 5;

    int bt1 = (*block1).BlockType();
    int bt2 = (*block2).BlockType();
    int bt3 = (*block3).BlockType();
    int bt4 = (*block4).BlockType();
    int type = 0;

    if (equal(block1, block4, blockCheckArray))
    {
        type = bt1 - 1;
        if (type <= 0) type = 6;
        (*block1).AssignBlockType(type);
    }
    else if (equal(block2, block5, blockCheckArray))
    {
        type = bt2 - 1;
        if (type <= 0) type = 6;
        (*block2).AssignBlockType(type);
    }
    else if (equal(block3, block6, blockCheckArray))
    {
        type = bt3 - 1;
        if (type == bt3) type--;
        if (type <= 0) type = 6;
        (*block3).AssignBlockType(type);
    }
    else if (equal(block4, row.end(), blockCheckArray))
    {
        type = bt4 - 1;
        if (type == bt3) type--;
        if (type <= 0) type = 6;
        (*block4).AssignBlockType(type);
    }
}
Sigh, I'm not sure if it helps to show this... at least it shows that I've tried something.
Basically, I construct the row by assigning random block types, described by the BlockType enum, to a Block object's constructor (a Block object has a blockType and a position).
Then I use a RowCheck function to see if there are 3 consecutive blockTypes in one row, and I have to do this for all block types. The *_match variables are arrays of 3 Block objects with the same block type. If I do find 3 consecutive block types, I simply decrement the first value by one. However, doing that might inadvertently produce another 3-match, so I make sure the block types are checked in order from greatest to least.
Ok, it's crappy, it's convoluted and it doesn't work! That's why I need your help.
It should suffice to keep a record of the previous two values, and loop when the newly generated one matches both of them.
For an arbitrary run length, it would make sense to size a history buffer on the fly and do the comparisons in a loop as well. But this should be close to matching your requirements.
int type, type_old, type_older;

type_older = (rand() % 6) + 1;
row.push_back(Block(type_older));

type_old = (rand() % 6) + 1;
row.push_back(Block(type_old));

for (int i = 2; i < 6; i++)
{
    type = (rand() % 6) + 1;
    while ((type == type_old) && (type == type_older)) {
        type = (rand() % 6) + 1;
    }
    row.push_back(Block(type));
    type_older = type_old;
    type_old = type;
}
Idea no. 1:

while (sequence doesn't satisfy you)
    generate a new sequence

Idea no. 2:

Precalculate all allowable sequences (there are 42,150 of them, out of 6^6 = 46,656 candidates)
Randomly choose an index and take that element

The second idea requires more memory, but is fast. The first one isn't slow either, because there is a very small probability that the while loop will iterate more than once or twice. HTH
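For what it's worth, here is a sketch of idea no. 2 (my own code; the 42,150 figure above is simply the number of candidates this filter keeps):

#include <cstdlib>
#include <vector>

// Enumerate all 6^6 = 46,656 candidate rows once, keeping only those with
// no value repeated 3 times in a row.
std::vector<std::vector<int>> buildAllValidRows()
{
    std::vector<std::vector<int>> all;
    for (long n = 0; n < 46656; ++n) {
        std::vector<int> seq(6);
        long m = n;
        for (int i = 0; i < 6; ++i) { seq[i] = (int)(m % 6) + 1; m /= 6; } // odometer digits
        bool ok = true;
        for (int i = 2; i < 6; ++i)
            if (seq[i] == seq[i - 1] && seq[i] == seq[i - 2]) { ok = false; break; }
        if (ok) all.push_back(seq);
    }
    return all;
}

// usage: build once, then each new row is a single random index
// static const std::vector<std::vector<int>> rows = buildAllValidRows();
// const std::vector<int>& row = rows[rand() % rows.size()];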
Most solutions seen so far involve a potentially infinite loop. May I suggest a different approach?
// generates a random number between 1 and 6,
// but never the same number three times in a row
int dice()
{
    static int a = -2;
    static int b = -1;
    int c;
    if (a != b)
    {
        // last two were different, pick any of the 6 numbers
        c = rand() % 6 + 1;
    }
    else
    {
        // last two were equal, so we need to choose from 5 numbers only
        c = rand() % 5 + 1;
        // prevent the same number from being generated again
        if (c == b) c = 6;
    }
    a = b;
    b = c;
    return c;
}
The interesting part is the else block. If the last two numbers were equal, there are only 5 different numbers to choose from, so I use rand() % 5 + 1 instead of rand() % 6 + 1. That call could still produce the same number as before, and it can never produce 6, so I simply map a colliding value to 6.
Solution with simple do-while loop (good enough for most cases):
vector<Block> row;
int type = (rand() % 6) + 1, new_type;
int repetition = 0;

for (int i = 0; i < 6; i++)
{
    row.push_back(Block(type));
    do {
        new_type = (rand() % 6) + 1;
    } while (repetition == MAX_REPETITION && new_type == type);
    repetition = new_type == type ? repetition + 1 : 0;
    type = new_type;
}

Solution without a loop (for those who dislike the non-deterministic nature of the previous solution):

vector<Block> row;
int type = (rand() % 6) + 1, new_type;
int repetition = 0;

for (int i = 0; i < 6; i++)
{
    row.push_back(Block(type));
    if (repetition != MAX_REPETITION)
        new_type = (rand() % 6) + 1;
    else
    {
        new_type = (rand() % 5) + 1;
        if (new_type >= type)
            new_type++;
    }
    repetition = new_type == type ? repetition + 1 : 0;
    type = new_type;
}
In both solutions MAX_REPETITION is equal to 1 for your case.
How about initializing a six-element array to [1, 2, 3, 4, 5, 6] and randomly interchanging the elements for a while? That is guaranteed to have no duplicates at all.
Lots of answers say "once you detect X in a row, recalculate the last one until you don't get an X"... In practice, for a game like this, that approach is millions of times faster than you need for "real-time" human interaction, so just do it!
But you're obviously uncomfortable with it and looking for something more inherently "bounded" and elegant. So, given you're generating numbers from 1..6: when you detect 2 Xs, you already know the next one could be a duplicate, so there are only 5 valid values. Generate a random number from 1 to 5, and if it's >= X, increment it by one.
That works a bit like this:

1..6 -> 3
1..6 -> 3
"oh no, we've got two 3s in a row"
1..5 -> ?
    if < 3 (i.e. 1 or 2): use as is
    if >= 3 (i.e. 3, 4 or 5): add 1, producing 4, 5 or 6
Then you know the last two elements differ... the new element takes the first spot when you resume checking for 2 equal elements in a row.
vector<BlockType> constructRow()
{
    vector<BlockType> row;
    row.push_back(STAR);          row.push_back(STAR);
    row.push_back(UP_TRIANGLE);   row.push_back(UP_TRIANGLE);
    row.push_back(DOWN_TRIANGLE); row.push_back(DOWN_TRIANGLE);
    row.push_back(CIRCLE);        row.push_back(CIRCLE);
    row.push_back(HEART);         row.push_back(HEART);
    row.push_back(DIAMOND);       row.push_back(DIAMOND);

    do
    {
        random_shuffle(row.begin(), row.end());
    } while (rowCheckFails(row));

    return row;
}
The idea is to use random_shuffle() here. You need to implement rowCheckFails() so that it satisfies the requirement; one possible version is sketched below.
EDIT
I may not understand your requirement properly. That's why I've put 2 of each block type in the row. You may need to put more.
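The answer leaves rowCheckFails() unimplemented; one plausible version (my own sketch) simply scans for a run of 3 equal block types:

// Returns true if the row contains 3 or more equal consecutive block types.
bool rowCheckFails(const vector<BlockType>& row)
{
    for (size_t i = 2; i < row.size(); ++i)
        if (row[i] == row[i - 1] && row[i] == row[i - 2])
            return true;
    return false;
}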
I think you would be better served to hide your random number generation behind a method or function. It could be a method or function that returns three random numbers at once, making sure that there are at least two distinct numbers in your output. It could also be a stream generator that makes sure that it never outputs three identical numbers in a row.
#include <array>
#include <cstdlib>

std::array<int, 3> get_random() {
    std::array<int, 3> ret;
    ret[0] = rand() % 6 + 1;
    ret[1] = rand() % 6 + 1;
    ret[2] = rand() % 6 + 1;
    if (ret[0] == ret[1] && ret[1] == ret[2]) {
        int replacement;
        do {
            replacement = rand() % 6 + 1;
        } while (replacement == ret[0]);
        ret[rand() % 3] = replacement;
    }
    return ret;
}
If you wanted six random numbers (it's a little difficult for me to tell, and the video was just baffling :) then it'll be a little more effort to generate the if condition:
for (int i = 0; i < 4; i++) {
    if (ret[i] == ret[i+1] && ret[i+1] == ret[i+2])
        /* three in a row */
If you always change ret[1] (the middle of the three) you'll never have three-in-a-row as a result of the change, but the output won't be random either: X Y X will happen more often than X X Y, because it can arise both by random chance and by being forced in the event of X X X.
First, some comments on the solutions above.
There is nothing wrong with the techniques that reject a random value if it isn't satisfactory. This is an example of rejection sampling, a widely used technique. For example, several algorithms for generating a random gaussian involve rejection sampling. One, the polar rejection method, involves repeatedly drawing a pair of numbers from U(-1,1) until the pair is non-zero and lies inside the unit circle. This throws out over 21% of the pairs. After finding a satisfactory pair, a simple transformation yields a pair of gaussian deviates. (The polar rejection method is now falling out of favor, being replaced by the ziggurat algorithm. That, too, uses rejection sampling.)
There is something very much wrong with rand() % 6. Don't do this. Ever. The low order bits from a random number generator, even a good random number generator, are not quite as "random" as are the high order bits.
There is something very much wrong with rand(), period. Most compiler writers apparently don't know beans about producing random numbers. Don't use rand().
Now a solution that uses the Boost random number library:
vector<Block> BlockField::ConstructRow(
    unsigned int max_run) // maximum number of consecutive duplicates allowed
{
    // The Mersenne Twister produces high quality random numbers ...
    // (in real code, seed it and construct it once, outside this function;
    // a default-constructed engine yields the same row on every call)
    boost::mt19937 rng;
    // ... but we want numbers between 1 and 6 ...
    boost::uniform_int<> six(1, 6);
    // ... so we need to glue the rng to our desired output.
    boost::variate_generator<boost::mt19937&, boost::uniform_int<> >
        roll_die(rng, six);

    vector<Block> row;
    int prev = 0;
    int run_length = 0;
    for (int ii = 0; ii < 6; ++ii) {
        int next;
        do {
            next = roll_die();
            run_length = (next == prev) ? run_length + 1 : 0;
        } while (run_length > max_run);
        row.push_back(Block(next));
        prev = next;
    }
    return row;
}
I know that this already has many answers, but a thought just occurred to me: you could have 7 arrays, one with all 6 digits, and one for each missing a given digit. Like this:
int v[7][6] = {
    {1, 2, 3, 4, 5, 6},
    {2, 3, 4, 5, 6, 0}, // zeros in here to make the code simpler,
    {1, 3, 4, 5, 6, 0}, // they are never used
    {1, 2, 4, 5, 6, 0},
    {1, 2, 3, 5, 6, 0},
    {1, 2, 3, 4, 6, 0},
    {1, 2, 3, 4, 5, 0}
};
Then you can have a 2-level history. Finally, to generate a number: if your match history is less than the max, shuffle v[0] and take v[0][0]. Otherwise, shuffle the first 5 values of v[n], where n is the previous number, and take v[n][0]. Something like this:
#include <algorithm>

int generate() {
    static int prev = -1;
    static int repeat_count = 1;
    static int v[7][6] = {
        {1, 2, 3, 4, 5, 6},
        {2, 3, 4, 5, 6, 0}, // zeros in here to make the code simpler,
        {1, 3, 4, 5, 6, 0}, // they are never used
        {1, 2, 4, 5, 6, 0},
        {1, 2, 3, 5, 6, 0},
        {1, 2, 3, 4, 6, 0},
        {1, 2, 3, 4, 5, 0}
    };

    int r;
    if (repeat_count < 2) {
        std::random_shuffle(v[0], v[0] + 6);
        r = v[0][0];
    } else {
        std::random_shuffle(v[prev], v[prev] + 5);
        r = v[prev][0];
    }
    if (r == prev) {
        ++repeat_count;
    } else {
        repeat_count = 1;
    }
    prev = r;
    return r;
}
This should give good randomness (not reliant on rand() % N), no infinite loops, and should be fairly efficient given the small number of values we shuffle each time.
Note that, due to the use of statics, this is not thread-safe. That may be fine for your usage; if not, you probably want to wrap this up in an object, each with its own state.

algorithm: find count of numbers within a given range

Given an unsorted number array which may contain duplicates, pre-process the array so that finding the count of numbers within a given range takes O(1) time.
For example, given 7,2,3,2,4,1,4,6, the count of numbers both >= 2 and <= 5 is 5 (2,2,3,4,4).
Sort the array. For each element in the sorted array, insert that element into a hash table, with the value of the element as the key, and its position in the array as the associated value. Any values that are skipped, you'll need to insert as well.
To find the number of items in a range, look up the position of the value at each end of the range in the hash table, and subtract the lower from the upper to find the size of the range.
This sounds suspiciously like one of those clever interview questions some interviewers like to ask, which is usually associated with hints along the way to see how you think.
Regardless... one possible way of implementing this is to make a list of the counts of numbers equal to or less than the list index.
For example, from your list above, generate the list: 0, 1, 3, 4, 6, 6, 7, 8. Then you can count the numbers between 2 and 5 by subtracting list[1] from list[5], as in the sketch below.
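As a concrete sketch of this idea (variable names are mine):

#include <iostream>
#include <vector>

int main()
{
    std::vector<int> arr = {7, 2, 3, 2, 4, 1, 4, 6};
    const int maxVal = 7;

    // cnt_le[v] = how many array elements are <= v
    std::vector<int> cnt_le(maxVal + 1, 0);
    for (int x : arr) ++cnt_le[x]; // histogram
    for (int v = 1; v <= maxVal; ++v)
        cnt_le[v] += cnt_le[v - 1]; // prefix sums

    // count of numbers in [2, 5] = (<= 5) - (<= 1)
    std::cout << cnt_le[5] - cnt_le[1] << "\n"; // prints 5
}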
Since we need O(1) access, the data structure needed will be memory-intensive. With a hash table, access would take O(n) in the worst case.
My solution:
Build a 2D matrix.
array = {2,3,2,4,1,4,6}; range of numbers = 0 to 6, so n = 7.
So we have to create an n x n matrix.
a[i][i] represents the total count of the element i,
so a[4][4] = 2 (since 4 appears 2 times in the array),
a[5][5] = 0,
and a[5][2] = count of numbers both >= 2 and <= 5 = 5.
// preprocessing stage 1: populate a[i][i] with the total count of element i
a[n][n] = {0};
for (i = 0; i < arrayLength; i++) {
    a[array[i]][array[i]]++;
}

// stage 2
for (i = 1; i < n; i++)
    for (j = 0; j < i; j++)
        a[i][j] = a[i-1][j] + a[i][i];
// we just add the count of element i to each value in row i-1 to get row i
Now a query for (5,2) just reads a[5][2] and answers in O(1).
#include <cstdio>
#include <cstring>

int main()
{
    int arr[8] = {7, 2, 3, 2, 4, 1, 4, 6};
    int count[9];
    int total = 0;

    memset(count, 0, sizeof(count));
    for (int i = 0; i < 8; i++)
        count[arr[i]]++;

    for (int k = 0; k < 9; k++)
    {
        if (k >= 2 && k <= 5 && count[k] > 0)
        {
            total = total + count[k];
        }
    }
    printf("%d\n", total);
    return 0;
}

C++: function creation using array

Write a function which has:
input: an array of pairs (unique id and weight) of length N, and a number K <= N
output: K random unique ids (from the input array)
Note: when called many times, the frequency with which an id appears in the output should be greater the more weight it has.
Example: an id with weight 5 should appear in the output 5 times more often than an id with weight 1. Also, the amount of memory allocated should be known at compile time, i.e. no additional memory should be allocated.
My question is: how do I solve this task?
EDIT
thanks for the responses, everybody!
Currently I can't understand how the weight of a pair affects the frequency of its appearance in the output. Can you give me a clearer, "for dummies" explanation of how it works?
Assuming a good enough random number generator:

Sum the weights (total_weight)
Repeat K times:
    Pick a number between 0 and total_weight (selection)
    Find the first pair where the sum of all the weights from the beginning of the array to that pair is greater than or equal to selection
    Write the first part of the pair to the output

You need enough storage to store the total weight. A sketch of this procedure is below.
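A minimal sketch of that procedure (the struct and names are mine; duplicates are allowed here, and uniqueness is handled separately, as discussed further down):

#include <cstdlib>
#include <iostream>
#include <vector>

struct IdWeight { int id; int weight; };

void pickK(const std::vector<IdWeight>& input, int K)
{
    int total_weight = 0;
    for (const IdWeight& p : input) total_weight += p.weight; // sum the weights

    for (int k = 0; k < K; ++k) {
        int selection = std::rand() % total_weight; // 0 .. total_weight-1
        int running = 0;
        for (const IdWeight& p : input) {
            running += p.weight;
            if (running > selection) { // first prefix sum past the selection
                std::cout << p.id << "\n";
                break;
            }
        }
    }
}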
Ok so you are given input as follows:
(3, 7)
(1, 2)
(2, 5)
(4, 1)
(5, 2)
And you want to pick a random number so that the weight of each id is reflected in the picking, i.e. pick a random number from the following list:
3 3 3 3 3 3 3 1 1 2 2 2 2 2 4 5 5
Initially I created a temporary array, but this can be done in memory as well: you can calculate the size of the list by summing up all the weights = X; in this example, 17.
Pick a random number between [0, X-1], and work out which id should be returned by looping through the list, doing a cumulative addition of the weights. Say the random number is 8:
(3, 7) total = 7, which is < 8
(1, 2) total = 9, which is >= 8 **boom** 1 is your id!
Now, since you need K random unique ids, you can create a hashtable from the initial array passed to you and work with that. Once you find an id, remove it from the hash and proceed with the algorithm. Edit: note that you create the hashmap only once! Your algorithm will work on it instead of looking through the array. I did not put this at the top to keep the answer clear.
As long as your random calculation is not secretly using any extra memory, you will need to store K random pickings, which are <= N, and a copy of the original array, so the maximum space requirement at runtime is O(2*N).
The asymptotic runtime is:

O(n) : create copy of original array into hashtable
+ (
    O(n) : calculate sum of weights
  + O(1) : calculate random number in range
  + O(n) : cumulative totals
  ) * K random pickings
= O(n*k) overall
This is a good question :)
This solution works with non-integer weights and uses constant extra space (i.e. space complexity = O(1)). It does, however, modify the input array, but the only difference in the end is that the elements will be in a different order.
Add the weight of each input to the weight of the following input, starting from the bottom and working your way up. Now each weight is actually the sum of that input's weight and all of the previous weights.
Let sum_weights = the sum of all of the weights, and n = N.
K times:
    Choose a random number r in the range [0, sum_weights)
    Binary search the first n elements for the first slot i where the (now summed) weight is greater than or equal to r
    Add input[i].id to the output
    Subtract input[i-1].weight from input[i].weight (unless i == 0). Now subtract input[i].weight from the following (> i) input weights and also from sum_weights
    Move input[i] to position [n-1] (sliding the intervening elements down one slot). This is the expensive part, as it's O(N) and we do it K times. You can skip this step on the last iteration
    Subtract 1 from n
Fix back all of the weights from n-1 down to 1 by subtracting the preceding input's weight
Time complexity is O(K*N). The expensive part (of the time complexity) is shuffling the chosen elements. I suspect there's a clever way to avoid that, but haven't thought of anything yet.
Update
It's unclear what the question means by "output: K random unique Ids". The solution above assumes that this meant that the output ids are supposed to be unique/distinct, but if that's not the case then the problem is even simpler:
Add the weight of each input to the weight of the following input, starting from the bottom and working your way up. Now each weight is actually the sum of that input's weight and all of the previous weights.

Let sum_weights = the sum of all of the weights, and n = N.
K times:
    Choose a random number r in the range [0, sum_weights)
    Binary search the first n elements for the first slot i where the (now summed) weight is greater than or equal to r
    Add input[i].id to the output
Fix back all of the weights from n-1 down to 1 by subtracting the preceding input's weight
Time complexity is O(K*log(N)) after the O(N) prefix-sum setup; a sketch of this variant is below.
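A sketch of this simpler variant (names are mine; prefix sums are computed in place, then each draw is a std::lower_bound):

#include <algorithm>
#include <cstdlib>
#include <vector>

struct Entry { int id; double weight; };

// Turns the weights into running sums, then draws K ids (duplicates possible).
std::vector<int> sample(std::vector<Entry>& input, int K)
{
    for (std::size_t i = 1; i < input.size(); ++i)
        input[i].weight += input[i - 1].weight; // in-place prefix sums
    const double sum_weights = input.back().weight;

    std::vector<int> output;
    for (int k = 0; k < K; ++k) {
        double r = (double)std::rand() / RAND_MAX * sum_weights; // [0, sum]
        // first slot whose summed weight is >= r
        auto it = std::lower_bound(input.begin(), input.end(), r,
            [](const Entry& e, double v) { return e.weight < v; });
        output.push_back(it->id);
    }
    return output;
}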
My short answer: strictly speaking, you can't, because the problem definition is contradictory. As Axn brilliantly noticed:

There is a little bit of a contradiction going on in the requirement. It states that K <= N. But as K approaches N, the frequency requirement will be contradicted by the uniqueness requirement. In the worst case, if K = N, all elements will be returned (i.e. they appear with the same frequency), irrespective of their weight.

Anyway, when K is small relative to N, the calculated frequencies will be pretty close to the theoretical values.
The task may be split into two subtasks:
Generate random numbers with a given distribution (specified by weights)
Generate unique random numbers
Generate random numbers with a given distribution
Calculate sum of weights (sumOfWeights)
Generate random number from the range [1; sumOfWeights]
Find an array element where the sum of weights from the beginning of the array is greater than or equal to the generated random number
Code
#include <iostream>
#include <cstdlib>
#include <ctime>

// 0 - id, 1 - weight
typedef unsigned Pair[2];

unsigned Random(Pair* i_set, unsigned* i_indexes, unsigned i_size)
{
    unsigned sumOfWeights = 0;
    for (unsigned i = 0; i < i_size; ++i)
    {
        const unsigned index = i_indexes[i];
        sumOfWeights += i_set[index][1];
    }
    const unsigned random = rand() % sumOfWeights + 1;
    sumOfWeights = 0;
    unsigned i = 0;
    for (; i < i_size; ++i)
    {
        const unsigned index = i_indexes[i];
        sumOfWeights += i_set[index][1];
        if (sumOfWeights >= random)
        {
            break;
        }
    }
    return i;
}
Generate unique random numbers
The well-known Durstenfeld-Fisher-Yates shuffle may be used to generate unique random numbers. See this great explanation.
It requires O(N) space, so if N is known at compile time, we are able to allocate the necessary space at compile time.
Now we have to combine these two algorithms: we just use our own Random() function instead of the standard rand() in the unique-number generation algorithm.
Code
template<unsigned N, unsigned K>
void Generate(Pair (&i_set)[N], unsigned (&o_res)[K])
{
    unsigned deck[N];
    for (unsigned i = 0; i < N; ++i)
    {
        deck[i] = i;
    }
    unsigned max = N - 1;
    for (unsigned i = 0; i < K; ++i)
    {
        const unsigned index = Random(i_set, deck, max + 1);
        std::swap(deck[max], deck[index]);
        o_res[i] = i_set[deck[max]][0];
        --max;
    }
}
Usage
int main()
{
    srand((unsigned)time(0));

    const unsigned c_N = 5; // N
    const unsigned c_K = 2; // K
    Pair input[c_N] = {{0, 5}, {1, 3}, {2, 2}, {3, 5}, {4, 4}}; // input array
    unsigned result[c_K] = {};

    const unsigned c_total = 1000000; // number of iterations
    unsigned counts[c_N] = {0};       // frequency counters
    for (unsigned i = 0; i < c_total; ++i)
    {
        Generate<c_N, c_K>(input, result);
        for (unsigned j = 0; j < c_K; ++j)
        {
            ++counts[result[j]];
        }
    }

    unsigned sumOfWeights = 0;
    for (unsigned i = 0; i < c_N; ++i)
    {
        sumOfWeights += input[i][1];
    }
    for (unsigned i = 0; i < c_N; ++i)
    {
        std::cout << (double)counts[i]/c_K/c_total   // empirical frequency
                  << " | "
                  << (double)input[i][1]/sumOfWeights // expected frequency
                  << std::endl;
    }
    return 0;
}
Output
N = 5, K = 2
Frequencies
Empirical | Expected
0.253813 | 0.263158
0.16584 | 0.157895
0.113878 | 0.105263
0.253582 | 0.263158
0.212888 | 0.210526

Corner case when the weights are actually ignored:

N = 5, K = 5
Frequencies
Empirical | Expected
0.2 | 0.263158
0.2 | 0.157895
0.2 | 0.105263
0.2 | 0.263158
0.2 | 0.210526
I assume that the ids in the output must be unique. This makes the problem a specific instance of random sampling problems.
The first approach that I can think of solves this in O(N*K) time (O(N^2) in the worst case, since K <= N), using O(N) memory (the input array itself plus constant extra memory).
I assume that the weights are positive.
Let A be the array of pairs.

1) Set N to A.length
2) Calculate the sum of all weights, W
3) Loop K times:
3.1) r = rand(0, W)
3.2) Loop over A and find the first index i such that A[1].w + ... + A[i].w <= r < A[1].w + ... + A[i+1].w
3.3) Add A[i].id to the output
3.4) W = W - A[i].w (do this before A[i] is overwritten in the next step)
3.5) A[i] = A[N-1] (or swap, if the array contents should be preserved)
3.6) N = N - 1
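Finally, a direct transcription of these steps into code (struct and function names are mine; note that step 3.4's subtraction happens before A[i] is overwritten):

#include <cstdlib>
#include <vector>

struct IdWeight { int id; double w; };

// A is taken by value so the caller's array is preserved; switch to an
// in-place version if the no-extra-allocation requirement matters.
std::vector<int> sampleUnique(std::vector<IdWeight> A, int K)
{
    int N = (int)A.size();
    double W = 0;
    for (const IdWeight& p : A) W += p.w;              // step 2

    std::vector<int> output;
    for (int k = 0; k < K; ++k) {                      // step 3
        double r = (double)std::rand() / RAND_MAX * W; // 3.1
        int i = 0;
        double acc = A[0].w;
        while (acc < r && i + 1 < N) acc += A[++i].w;  // 3.2: linear scan
        output.push_back(A[i].id);                     // 3.3
        W -= A[i].w;                                   // 3.4: before overwriting
        A[i] = A[N - 1];                               // 3.5
        --N;                                           // 3.6
    }
    return output;
}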