Picking random coordinates without duplicates? - c++

I want to choose random coordinates on an 8x8 board. The x and y coordinates can only be -8, -6, -4, -2, 0, 2, 4, 6, and 8. I want to choose random coordinates for 20 objects, but I don't want any two objects to have the same coordinates. I'm working in C++.

You've only got 9 possible values for each coordinate, so that's 81 possible points in all. The simplest solution would be to just enumerate all possible points (e.g. in an array or vector), and then randomly select 20.
You can randomly select 20 by picking an index from 0 to 80, swapping that element of the array with index 80, and then randomly picking an index from 0 to 79, swapping that with index 79, and so on 20 times. Then the last 20 elements of your array will be 20 distinct random points.

Take all of the coordinate pairs in your set, and toss them into a list, and generate a random permutation of the list (standard algorithms exist for this, such as the algorithm Laurence is suggesting). Take the first 20 elements of the permutation.

If you can enumerate all coordinates on the board, you can use any sampling algorithm. You're on a 9x9 grid; just pick 20 values out of the range [0,80] and then translate them into grid coordinates:
// Say the number picked is "n"
int x = ((n % 9) - 4) * 2;
int y = ((n / 9) - 4) * 2;
You can use any sampling algorithm to generate the ns; check out the answers to this question, for example.
The advantage of this approach over generating the points explicitly is that you can save quite a bit of memory (and processing time) on large grids. If they're really large and you're picking a small sample, the obvious algorithm works fine too: just pick a random point and try again if you already picked it. The only problem with this algorithm is that it can end up doing a lot of retries if you're selecting a large fraction of the set.
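A minimal sketch of that retry loop for the 9x9 grid above (assuming C++11; the function name is illustrative):

#include <random>
#include <set>
#include <utility>
#include <vector>

// Pick 20 distinct points from the 81-point grid by retrying on duplicates.
std::vector<std::pair<int, int> > pickPoints()
{
    std::mt19937 rng(std::random_device{}());
    std::uniform_int_distribution<int> dist(0, 80);
    std::set<int> seen; // grid indices already used
    std::vector<std::pair<int, int> > points;
    while (points.size() < 20)
    {
        int n = dist(rng);
        if (!seen.insert(n).second) continue; // duplicate: try again
        points.push_back(std::make_pair(((n % 9) - 4) * 2, ((n / 9) - 4) * 2));
    }
    return points;
}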

Here is Laurence's algorithm as a program. It's working fine.
#include <iostream>
#include <vector>
#include <cstdlib>   // rand, srand
#include <ctime>     // time
#include <algorithm> // swap
using namespace std;

//To store the x and y coordinates of a point
struct Point
{
    int x, y;
};

int main()
{
    vector<Point> v;
    Point p;
    //Populate the vector with all 81 combinations.
    for(int i = -8; i <= 8; i += 2)
    {
        for(int j = -8; j <= 8; j += 2)
        {
            p.x = i;
            p.y = j;
            v.push_back(p);
        }
    }
    srand(time(NULL));
    int lastIndex = 80;
    for(int i = 0; i < 20; i++)
    {
        int randNum = rand() % (81 - i);
        //Swap to isolate chosen points at the end of the vector.
        std::swap(v[randNum], v[lastIndex - i]);
    }
    //Print the chosen random coordinates
    cout << "Random points chosen are" << endl;
    for(int i = 61; i < 81; i++)
    {
        Point p = v[i];
        cout << p.x << "\t" << p.y << endl;
    }
}

You could use std::random_shuffle for instance, since you have a finite number of integer coordinates. So just shuffle that set of vectors/positions around. You can also pass your own RNG to random_shuffle as a function object.
Example:
#include <algorithm> //for copy and random_shuffle
#include <utility>   //for pair and make_pair
#include <vector>
...
std::vector<std::pair<int, int> > coords;
std::vector<std::pair<int, int> > coords20(20);
for(int y = -8; y <= 8; y += 2)
    for(int x = -8; x <= 8; x += 2)
        coords.push_back(std::make_pair(x, y));
std::random_shuffle(coords.begin(), coords.end());
std::copy(coords.begin(), coords.begin() + 20, coords20.begin());
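Note that std::random_shuffle was deprecated in C++14 and removed in C++17; with a C++11 compiler the equivalent is std::shuffle with an explicit engine:

#include <random>

std::mt19937 rng(std::random_device{}());
std::shuffle(coords.begin(), coords.end(), rng);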

Related

Finding the number of sub arrays that have a sum of K

I am trying to find the number of sub arrays that have a sum equal to k:
int subarraySum(vector<int>& nums, int k)
{
    int start, end, curr_sum = 0, count = 0;
    start = 0, end = 0;
    while (end < (int)nums.size())
    {
        curr_sum = curr_sum + nums[end];
        end++;
        while (start < end && curr_sum >= k)
        {
            if (curr_sum == k)
                count++;
            curr_sum = curr_sum - nums[start];
            start++;
        }
    }
    return count;
}
The above code I have written, works for most cases, but fails for the following:
array = {-1, -1, 1} with k = 0
I have tried to add another while loop to iterate from the start and go up the array until it reaches the end:
int subarraySum(vector<int>& nums, int k)
{
    int start, end, curr_sum = 0, count = 0;
    start = 0, end = 0;
    while (end < (int)nums.size())
    {
        curr_sum = curr_sum + nums[end];
        end++;
        while (start < end && curr_sum >= k)
        {
            if (curr_sum == k)
                count++;
            curr_sum = curr_sum - nums[start];
            start++;
        }
    }
    while (start < end)
    {
        if (curr_sum == k)
            count++;
        curr_sum = curr_sum - nums[start];
        start++;
    }
    return count;
}
Why is this not working? I am sliding the window until the last element is reached, so it should have found a sum equal to k. How can I solve this issue?
Unfortunately, you did not implement the sliding window correctly, and a sliding window is not really a solution for this problem anyway. One of your main issues is that you do not move the start of the window based on the proper condition: you always sum up and wait until the sum is greater than the search value.
This will not really work, especially for your example -1, -1, 1. Its running sums are -1, -2, -1, so you never see the 0, although it is there. You may have the idea to write while (start < end && curr_sum != k), but this will also not work, because you still do not handle the start pointer correctly.
Your approach leads to the brute-force solution, which typically takes something like N*N loop operations, where N is the size of the array, because it needs a doubly nested loop.
That will of course always work, but it may be very time-consuming and, in the end, too slow.
Anyway, let us implement that. We will use each value in the std::vector as a start value and try out all subarrays beginning there. We must evaluate all following values in the std::vector, because, for example, the last value could be a big negative number that brings the sum back down to the search value.
We could implement this for example like the following:
#include <iostream>
#include <vector>
using namespace std;

int subarraySum(vector<int>& numbers, int searchSumValue) {
    // Here we will store the result
    int resultingCount{};

    // Iterate over all values in the array. So, use all different start values
    for (std::size_t i{}; i < numbers.size(); ++i) {
        // Here we store the running sum of the elements in the vector
        int sum{ numbers[i] };

        // Check for the trivial case. A one-element subarray may already match the search value
        if (sum == searchSumValue) ++resultingCount;

        // Now we build all subarrays beginning with the start value
        for (std::size_t k{ i + 1 }; k < numbers.size(); ++k) {
            sum += numbers[k];
            if (sum == searchSumValue) ++resultingCount;
        }
    }
    return resultingCount;
}

int main() {
    vector v{ -1, -1, 1 };
    std::cout << subarraySum(v, 0);
}
But, as said, the above is often too slow for big vectors, and there is indeed a better solution available, based on a DP (dynamic programming) approach.
It uses so-called prefix sums: running sums of everything seen before the currently evaluated value.
Let us look at an example: a std::vector with 5 values {1,2,3,4,5}, in which we want to find subarrays with a sum of 9.
We can see that there are 2 such subarrays: {2,3,4} and {4,5}.
Let us investigate further
Index          0   1   2   3   4
Value          1   2   3   4   5

We can now add a running sum and look at the delta between the currently evaluated element and its left neighbor, its over-next neighbor, and so on. If a delta is equal to our search value, then there must be a subarray building this sum.

Running sum    1   3   6  10  15
Delta, 1 left      2   3   4   5
Delta, 2 left          5   7   9
Delta, 3 left              9  12
Example {2,3,4}: if we evaluate the 4, with a running sum of 10, and subtract the search value 9, then we get the previous running sum 1 (1 + 9 = 10), so the values in between build the subarray.
Example {4,5}: if we evaluate the 5, with a running sum of 15, and subtract the search value 9, then we get the previous running sum 6 (6 + 9 = 15), so again the values in between build the subarray.
We can find all solutions using the same approach.
So the only thing we need to do is subtract the search value from the current running sum and check whether we have already seen that running sum before.
Like: "search value" + "previously calculated sum" = "current running sum".
Or: "current running sum" - "search value" = "previously calculated sum".
Again, we need to do the subtraction and check if we have already calculated such a sum previously.
So we need to store all previously calculated running sums. And, because such a sum may appear more than once, we need to count the occurrences of equal running sums.
It is very hard to digest, and you need to think a while to understand.
With the above wisdom, you can draft the below potential solution.
#include <iostream>
#include <vector>
#include <unordered_map>

int subarraySum(std::vector<int>& numbers, int searchSumValue) {
    // Here we will store the result
    int resultingSubarrayCount{};

    // Here we will store all running sums and how often their value appeared
    std::unordered_map<int, int> countOfRunningSums;

    // Continuously calculated running sum
    int runningSum{};

    // And initialize the first value
    countOfRunningSums[runningSum] = 1;

    // Now iterate over all values in the vector
    for (const int n : numbers) {
        // Calculate the running sum
        runningSum += n;

        // Check if we have already seen the needed previous sum,
        // and add its number of occurrences to our resulting number of subarrays
        resultingSubarrayCount += countOfRunningSums[runningSum - searchSumValue];

        // Store the new running sum, i.e. increment its counter
        // (the running sum may have existed already)
        countOfRunningSums[runningSum]++;
    }
    return resultingSubarrayCount;
}

int main() {
    std::vector v{ 1, 2, 3, 4, 5 };
    std::cout << subarraySum(v, 9);
}

OpenCL 1D range loop without knowledge of global size

I was wondering how I can iterate over a loop with any number of work items (per group is irrelevant).
I have 3 arrays and one of them is 2-dimensional (a matrix). The first array contains a set of integers. The matrix is filled with another set of (repeated and random) integers.
The third one is only to store the results.
For each number from the first array, I need to find the farthest pair of its occurrences in the matrix.
To summarize:
A: Matrix with random numbers
num: Array with numbers to search in A
d: Array with maximum distances of pairs of each number from num
The algorithm is simple (as I don't need to optimize it): I only compare the calculated Manhattan distances and keep the maximum value.
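For reference, the Manhattan distance used here is just the sum of the absolute coordinate differences; a trivial helper (illustrative, not from the original post) could be:

#include <cstdlib> // std::abs

// Manhattan distance between matrix cells (i1, j1) and (i2, j2)
int manhattan(int i1, int j1, int i2, int j2)
{
    return std::abs(i1 - i2) + std::abs(j1 - j2);
}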
To keep it simple, it does the following (C-like pseudo code):
for(number in num){
    maxDistance = 0
    for(row in A){
        for(column in A){
            //calculateDistance is a function with another nested loop like this one;
            //it returns the max distance found, and 0 otherwise
            currentDistance = calculateDistance(row, column, max)
            if(currentDistance > maxDistance){
                maxDistance = currentDistance
            }
        }
    }
}
As you can see, there is no data dependency between iterations. I tried to assign each work item a slice of the matrix A, but it still doesn't convince me.
IMPORTANT: The kernel must be executed with only one dimension for the problem.
Any ideas? How can I use the global id to make multiple searches at once?
Edit:
I added the code to clear away any doubt.
Here is the kernel:
__kernel void maxDistances(int N, __constant int *A, int n, __constant int *numbers, __global int *distances)
{
    //N is the matrix row and column size
    //A is the matrix
    //n is the total count of numbers to be searched
    //numbers is the array containing the numbers
    //distances is the array containing the computed distances
    size_t id = get_global_id(0);
    int slice = (N*N)/get_global_size(0);
    for(int idx_num = 0; idx_num < n; idx_num++)
    {
        int number = numbers[idx_num];
        int currentDistance = 0;
        int maxDistance = 0;
        for(int c = id*slice; c < (id+1)*slice; c++)
        {
            int i = c/N;
            int j = c%N;
            if(*CELL(A,N,i,j) == number){
                coord_t coords;
                coords.i = i;
                coords.j = j;
                //bestDistance is a function with 2 nested loops iterating over
                //rows and columns to retrieve the farthest pair of the number
                currentDistance = bestDistance(N, A, coords, number, maxDistance);
                if(currentDistance > maxDistance)
                {
                    maxDistance = currentDistance;
                }
            }
        }
        distances[idx_num] = maxDistance;
    }
}
This answer may seem incomplete; nevertheless, I am going to post it in order to close the question.
My problem was not the code, the kernel, or the algorithm; it was the machine. The above code is correct and works perfectly. After I tried my program on another machine, it executed and computed the solution with no problem at all.
So, in brief, the problem was the OpenCL device or, most likely, the host libraries.

C++ using lists to solve sudoku

I'm trying to make a sudoku solver in C++. I want to keep a 9 by 9 array (obviously). I'm now figuring out a way to keep track of the possible values. I thought about a list for every entry in the array: the list initially has the numbers 1 to 9, and every iteration I would be able to get rid of some values.
Now my question is, can I assign one list to every entry in the 2D array, if so how? And else is there an other/better option?
I'm a beginning programmer and this is basically my first project in C++.
Thanks in advance!
One simple solution is to use a set of one-bit flags for each square, e.g.
uint16_t board[9][9]; // 16 x 1 bit flags for each square where 9 bits are used
// to represent possible values for the square
Then you can use bitwise operators to set/clear/test each bit, e.g.
board[i][j] |= (1 << n); // set bit n at board position i, j
board[i][j] &= ~(1 << n); // clear bit n at board position i, j
test = (board[i][j] & (1 << n)) != 0; // test bit n at board position i, j
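For illustration, here is a minimal self-contained sketch of this scheme, assuming values 1..9 map to bits 1..9 (not part of the original answer):

#include <cstdint>
#include <iostream>

int main()
{
    std::uint16_t board[9][9];

    // Initially every value 1..9 is possible in every square: bits 1..9 set.
    for (int i = 0; i < 9; ++i)
        for (int j = 0; j < 9; ++j)
            board[i][j] = 0x3FE; // binary 0000001111111110

    board[0][0] &= ~(1u << 5); // rule out value 5 at square (0, 0)

    int candidates = 0; // count remaining candidates at (0, 0)
    for (int n = 1; n <= 9; ++n)
        if (board[0][0] & (1u << n)) ++candidates;

    std::cout << candidates << std::endl; // prints 8
}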
Well you can create an array of sets by doing
std::array<std::set<int>,81> possibleValues;
for example. You can fill this array with all possibilities by writing
const auto allPossible = std::set<int>{ 0, 1, 2, 3, 4, 5, 6, 7, 8 };
std::fill( std::begin(possibleValues), std::end(possibleValues),
allPossible );
if you are using a modern C++11 compiler. This is how you can set/clear and test each entry:
possibleValues[x+9*y].insert( n ); // sets that n is possible at (x,y).
possibleValues[x+9*y].erase( n ); // clears the possibility of having n at (x,y).
possibleValues[x+9*y].count( n ) != 0 // tells, if n is possible at (x,y).
If performance is an issue, you might want to use bit operations rather than (relatively) heavyweight std::set operations. In this case use
std::array<short, 81> possibleValues;
std::fill( begin(possibleValues), end(possibleValues), (1<<9)-1 );
The value n is possible for the field (x,y) if and only if (possibleValues[x+9*y] & (1<<n)) != 0, where all indices start at 0 in this case.
You can always think of your sudoku as a 3D array, using the third dimension to store the possible values:
// set "1" in cell's which index corespond to a possible value for the Sudoku cell
for (int x = 0; x < 9; x++)
for (int y = 0; y < 9; y++)
for (int i = 1; i < 10; i++)
arr[x][y][i] = 1;
and arr[x][y][0] contains the value of your sudoku cell.
To remove, for example, the value 5 as a possibility for the cell [x][y], just set arr[x][y][5] = 0.

Finding smallest values of given vectors

How can I find the smallest value of each column in a given set of vectors efficiently?
For example, consider the following program:
#include <iostream>
#include <vector>
#include <iterator>
#include <cstdlib>
using namespace std;

typedef vector<double> v_t;

int main(){
    v_t v1, v2, v3;
    for (int i = 1; i < 10; i++){
        v1.push_back(rand() % 10);
        v2.push_back(rand() % 10);
        v3.push_back(rand() % 10);
    }
    copy(v1.begin(), v1.end(), ostream_iterator<double>(cout, " "));
    cout << endl;
    copy(v2.begin(), v2.end(), ostream_iterator<double>(cout, " "));
    cout << endl;
    copy(v3.begin(), v3.end(), ostream_iterator<double>(cout, " "));
    cout << endl;
}
Let the output be
3 5 6 1 0 6 2 8 2
6 3 2 2 9 0 6 7 0
7 5 9 7 3 6 1 9 2
In this program I want to find the smallest value of every column (of the 3 given vectors) and put it into a vector. That is, I want to define a vector v_t vfinal that will have the values:
3 3 2 1 0 0 1 7 0
Is there an efficient way to do this? I mention efficient because my program may have to find the smallest values among a very large number of vectors. Thank you.
Update:
I'm trying to use something like this which I used in one of my previous programs
int count = std::inner_product(A, A+5, B, 0, std::plus<int>(), std::less<int>());
This counts the number of minimum elements between two arrays A and B. Wouldn't it be efficient enough if I could loop through and use a similar kind of function to find the minimal values? I'm not claiming it can be done; it's just an idea that may be improved upon, but I don't know how.
You can use std::transform for this. The loops are still there, they're just hidden inside the algorithm. Each additional vector to process is a call to std::transform.
This does your example problem in two linear passes.
#include <algorithm>
#include <cstdlib>
#include <vector>

typedef std::vector<double> v_t;

int main()
{
    v_t v1, v2, v3, vfinal(9); // note: vfinal sized to accept results

    for (int i = 1; i < 10; ++i) {
        v1.push_back(rand() % 10);
        v2.push_back(rand() % 10);
        v3.push_back(rand() % 10);
    }
    std::transform(v1.begin(), v1.end(), v2.begin(), vfinal.begin(), std::min<double>);
    std::transform(v3.begin(), v3.end(), vfinal.begin(), vfinal.begin(), std::min<double>);
}
Note: this works in MSVC++ 2010. I had to provide a min functor for gcc 4.3.
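If a C++11 compiler is available, one way to sidestep that portability issue is to pass a lambda instead of taking std::min's address (a sketch, not from the original answer):

std::transform(v1.begin(), v1.end(), v2.begin(), vfinal.begin(),
               [](double a, double b) { return a < b ? a : b; });
std::transform(v3.begin(), v3.end(), vfinal.begin(), vfinal.begin(),
               [](double a, double b) { return a < b ? a : b; });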
I think that the lower bound of your problem is O(n*m), where n is the number of vectors and m the elements of each vector.
The trivial algorithm (comparing the elements at the same index of the different vectors) is as efficient as it can be, I think.
The easiest way to implement it would be to put all your vectors in some data structure (a simple C-like array, or maybe a vector of vectors).
The best way to do this would be to use a vector of vectors and just simple looping.
void find_mins(const std::vector<std::vector<int> >& inputs, std::vector<int>& outputs)
{
    // Assuming that each inner vector has the same size, resize the output
    // vector so it can hold one minimum per column.
    outputs.resize(inputs[0].size());
    for (std::size_t col = 0; col < inputs[0].size(); ++col)
    {
        int min = inputs[0][col];
        for (std::size_t row = 1; row < inputs.size(); ++row)
            if (inputs[row][col] < min) min = inputs[row][col];
        outputs[col] = min;
    }
}
To find the smallest number in a vector, you simply have to examine each element in turn; there's no quicker way, at least from an algorithmic point-of-view.
In terms of practical performance, cache issues may affect you here. As has been mentioned in a comment, it will probably be more cache-efficient if you could store your vectors column-wise rather than row-wise. Alternatively, you may want to do all min searches in parallel, so as to minimise cache misses. i.e. rather than this:
foreach (col)
{
    foreach (row)
    {
        x_min[col] = std::min(x_min[col], x[col][row]);
    }
}
you should probably do this:
foreach (row)
{
    foreach (col)
    {
        x_min[col] = std::min(x_min[col], x[col][row]);
    }
}
Note that STL already provides a nice function to do this: min_element().
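For a single vector that looks like this (using v1 from the question):

#include <algorithm>

// smallest value of one vector; for column minima you would still loop over columns
double smallest = *std::min_element(v1.begin(), v1.end());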

C++: function creation using array

Write a function which has:
input: an array of pairs (unique id and weight) of length N, and K <= N
output: K random unique ids (from the input array)
Note: when called many times, an id should appear in the output more frequently the more weight it has.
Example: an id with a weight of 5 should appear in the output 5 times more often than an id with a weight of 1. Also, the amount of memory allocated should be known at compile time, i.e. no additional memory should be allocated.
My question is: how do I solve this task?
EDIT
Thanks for the responses, everybody!
Currently I can't understand how the weight of a pair affects the frequency of its appearance in the output. Can you give me a clearer, "for dummies" explanation of how it works?
Assuming a good enough random number generator:
Sum the weights (total_weight)
Repeat K times:
    Pick a number between 0 and total_weight (selection)
    Find the first pair where the sum of all the weights from the beginning of the array to that pair is greater than or equal to selection
    Write the first part of the pair (the id) to the output
You need enough storage to store the total weight.
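A minimal sketch of one such weighted pick (assuming C++11 and positive integer weights; Entry and weightedPick are illustrative names):

#include <random>
#include <vector>

struct Entry { int id; int weight; };

// Pick one id with probability proportional to its weight.
int weightedPick(const std::vector<Entry>& entries, std::mt19937& rng)
{
    int total_weight = 0;
    for (const Entry& e : entries) total_weight += e.weight;

    std::uniform_int_distribution<int> dist(1, total_weight);
    int selection = dist(rng);

    // First pair whose cumulative weight reaches the selection.
    int running = 0;
    for (const Entry& e : entries)
    {
        running += e.weight;
        if (running >= selection) return e.id;
    }
    return entries.back().id; // not reached for positive weights
}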
Ok so you are given input as follows:
(3, 7)
(1, 2)
(2, 5)
(4, 1)
(5, 2)
And you want to pick a random number such that the weight of each id is reflected in the picking, i.e. pick a random number from the following list:
3 3 3 3 3 3 3 1 1 2 2 2 2 2 4 5 5
Initially I created a temporary array, but this can be done purely in memory as well: you can calculate the size of the list by summing all the weights up = X; in this example X = 17.
Pick a random number between [0, X-1] and calculate which id should be returned by looping through the list, doing a cumulative addition of the weights. Say the random number is 8:
(3, 7) total = 7 which is < 8
(1, 2) total = 9 which is >= 8 **boom** 1 is your id!
Now, since you need K random unique ids, you can create a hashtable from the initial array passed to you and work with that. Once you find an id, remove it from the hash and proceed with the algorithm. Edit: Note that you create the hashmap only once! Your algorithm will work on this instead of looking through the array. I did not put this at the top to keep the answer clear.
As long as your random calculation is not secretly using any extra memory, you will need to store K random pickings (K <= N) and a copy of the original array, so the maximum space requirement at runtime is O(2*N).
Asymptotic runtime is:
O(n) : create a copy of the original array into a hashtable +
(
    O(n) : calculate sum of weights +
    O(1) : calculate random number in range +
    O(n) : cumulative totals
) * K random pickings
= O(n*k) overall
This is a good question :)
This solution works with non-integer weights and uses constant space (ie: space complexity = O(1)). It does, however modify the input array, but the only difference in the end is that the elements will be in a different order.
Add the weight of each input to the weight of the following input, starting from the bottom working your way up. Now each weight is actually the sum of that input's weight and all of the previous weights.
sum_weights = the sum of all of the weights, and n = N.
K times:
    Choose a random number r in the range [0, sum_weights)
    Binary search the first n elements for the first slot where the (now summed) weight is greater than or equal to r; call its index i
    Add input[i].id to the output
    Subtract input[i-1].weight from input[i].weight (unless i == 0). Then subtract input[i].weight from all following (> i) input weights and also from sum_weights
    Move input[i] to position [n-1] (sliding the intervening elements down one slot). This is the expensive part, as it's O(N) and we do it K times; you can skip this step on the last iteration
    Subtract 1 from n
Fix back all of the weights from n-1 down to 1 by subtracting the preceding input's weight
Time complexity is O(K*N). The expensive part (of the time complexity) is shuffling the chosen elements. I suspect there's a clever way to avoid that, but haven't thought of anything yet.
Update
It's unclear what the question means by "output: K random unique Ids". The solution above assumes that this meant that the output ids are supposed to be unique/distinct, but if that's not the case then the problem is even simpler:
Add the weight of each input to the weight of the following input, starting from the bottom working your way up. Now each weight is actually the sum of that input's weight and all of the previous weights.
sum_weights = the sum of all of the weights, and n = N.
K times:
    Choose a random number r in the range [0, sum_weights)
    Binary search the first n elements for the first slot where the (now summed) weight is greater than or equal to r; call its index i
    Add input[i].id to the output
Fix back all of the weights from n-1 down to 1 by subtracting the preceding input's weight
Time complexity is O(K*log(N)).
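A sketch of this non-unique variant with a precomputed prefix-sum array and std::upper_bound (assuming C++11; the names are illustrative):

#include <algorithm>
#include <random>
#include <vector>

struct Entry { int id; int weight; };

// K weighted picks with repetition allowed: O(N) setup + O(K*log N) picking.
std::vector<int> pickK(const std::vector<Entry>& input, int k, std::mt19937& rng)
{
    // prefix[i] = input[0].weight + ... + input[i].weight
    std::vector<long long> prefix(input.size());
    long long running = 0;
    for (std::size_t i = 0; i < input.size(); ++i)
        prefix[i] = running += input[i].weight;

    std::uniform_int_distribution<long long> dist(0, running - 1);
    std::vector<int> output;
    for (int j = 0; j < k; ++j)
    {
        long long r = dist(rng);
        // First slot whose summed weight is greater than r.
        std::size_t i = std::upper_bound(prefix.begin(), prefix.end(), r)
                      - prefix.begin();
        output.push_back(input[i].id);
    }
    return output;
}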
My short answer: in no way.
Just because the problem definition is incorrect. As Axn brilliantly noticed:
There is a little bit of contradiction going on in the requirement. It states that K <= N. But as K approaches N, the frequency requirement will be contradicted by the Uniqueness requirement. Worst case, if K=N, all elements will be returned (i.e appear with same frequency), irrespective of their weight.
Anyway, when K is small relative to N, the calculated frequencies will be pretty close to the theoretical values.
The task may be split into two subtasks:
Generate random numbers with a given distribution (specified by weights)
Generate unique random numbers
Generate random numbers with a given distribution
    Calculate the sum of weights (sumOfWeights)
    Generate a random number in the range [1, sumOfWeights]
    Find the first array element for which the sum of weights from the beginning of the array is greater than or equal to the generated random number
Code
#include <iostream>
#include <cstdlib>
#include <ctime>
#include <algorithm> // std::swap

// 0 - id, 1 - weight
typedef unsigned Pair[2];

unsigned Random(Pair* i_set, unsigned* i_indexes, unsigned i_size)
{
    unsigned sumOfWeights = 0;
    for (unsigned i = 0; i < i_size; ++i)
    {
        const unsigned index = i_indexes[i];
        sumOfWeights += i_set[index][1];
    }
    const unsigned random = rand() % sumOfWeights + 1;
    sumOfWeights = 0;
    unsigned i = 0;
    for (; i < i_size; ++i)
    {
        const unsigned index = i_indexes[i];
        sumOfWeights += i_set[index][1];
        if (sumOfWeights >= random)
        {
            break;
        }
    }
    return i;
}
Generate unique random numbers
The well-known Durstenfeld/Fisher-Yates algorithm may be used for generating unique random numbers. See this great explanation.
It requires an extra array of N indices, so if N is defined at compile time, we are able to allocate the necessary space at compile time as well.
Now we have to combine these two algorithms: we just use our own Random() function instead of the standard rand() in the unique-number generation.
Code
template<unsigned N, unsigned K>
void Generate(Pair (&i_set)[N], unsigned (&o_res)[K])
{
    unsigned deck[N];
    for (unsigned i = 0; i < N; ++i)
    {
        deck[i] = i;
    }
    unsigned max = N - 1;
    for (unsigned i = 0; i < K; ++i)
    {
        const unsigned index = Random(i_set, deck, max + 1);
        std::swap(deck[max], deck[index]);
        o_res[i] = i_set[deck[max]][0];
        --max;
    }
}
Usage
int main()
{
    srand((unsigned)time(0));
    const unsigned c_N = 5; // N
    const unsigned c_K = 2; // K
    Pair input[c_N] = {{0, 5}, {1, 3}, {2, 2}, {3, 5}, {4, 4}}; // input array
    unsigned result[c_K] = {};
    const unsigned c_total = 1000000; // number of iterations
    unsigned counts[c_N] = {0}; // frequency counters
    for (unsigned i = 0; i < c_total; ++i)
    {
        Generate<c_N, c_K>(input, result);
        for (unsigned j = 0; j < c_K; ++j)
        {
            ++counts[result[j]];
        }
    }
    unsigned sumOfWeights = 0;
    for (unsigned i = 0; i < c_N; ++i)
    {
        sumOfWeights += input[i][1];
    }
    for (unsigned i = 0; i < c_N; ++i)
    {
        std::cout << (double)counts[i]/c_K/c_total // empirical frequency
                  << " | "
                  << (double)input[i][1]/sumOfWeights // expected frequency
                  << std::endl;
    }
    return 0;
}
Output
N = 5, K = 2
Frequencies
Empirical   | Expected
0.253813    | 0.263158
0.16584     | 0.157895
0.113878    | 0.105263
0.253582    | 0.263158
0.212888    | 0.210526

Corner case when the weights are effectively ignored:
N = 5, K = 5
Frequencies
Empirical   | Expected
0.2         | 0.263158
0.2         | 0.157895
0.2         | 0.105263
0.2         | 0.263158
0.2         | 0.210526
I do assume that the ids in the output must be unique. This makes the problem a specific instance of random sampling problems.
The first approach that I can think of solves this in O(K*N) time (O(N^2) in the worst case, when K = N), using O(N) memory (the input array itself plus constant extra memory).
I assume that the weights are positive.
Let A be the array of pairs.
1) Set N to be A.length
2) Calculate the sum of all weights, W.
3) Loop K times:
    3.1) r = rand(0, W)
    3.2) Loop over A and find the first index i such that A[1].w + ... + A[i].w <= r < A[1].w + ... + A[i+1].w
    3.3) Add A[i].id to the output
    3.4) W = W - A[i].w (do this before A[i] is overwritten)
    3.5) A[i] = A[N-1] (or swap, if the array contents should be preserved)
    3.6) N = N - 1
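A minimal sketch of these steps (assuming C++11 and positive weights; the function works on a local copy of the array, and all names are illustrative):

#include <random>
#include <vector>

struct Entry { int id; int weight; };

// K distinct weighted picks in O(K*N) time.
std::vector<int> sampleK(std::vector<Entry> a, int k, std::mt19937& rng)
{
    int n = static_cast<int>(a.size());  // 1) N = A.length
    long long w = 0;
    for (std::size_t t = 0; t < a.size(); ++t)
        w += a[t].weight;                // 2) W = sum of all weights

    std::vector<int> out;
    for (int pick = 0; pick < k; ++pick) // 3) loop K times
    {
        long long r =
            std::uniform_int_distribution<long long>(0, w - 1)(rng); // 3.1
        int i = 0;
        long long running = a[0].weight;
        while (running <= r) running += a[++i].weight;               // 3.2
        out.push_back(a[i].id);                                      // 3.3
        w -= a[i].weight;                // 3.4) subtract before overwriting
        a[i] = a[n - 1];                 // 3.5) drop the chosen element
        --n;                             // 3.6)
    }
    return out;
}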