Fast weighted random selection from very large set of values

I'm currently working on a problem that requires the random selection of an element from a set. Each element has a weight (selection probability) associated with it.
My problem is that for sets with a small number of elements, say 5-10, the running time of the solution I was using is acceptable; however, as the number of elements increases to, say, 1K or 10K, the running time becomes unacceptable.
My current strategy is:
Select a random value X in the range [0,1).
Iterate over the elements, summing their weights until the sum is greater than X.
The element that caused the sum to exceed X is chosen and returned.
For large sets and a large number of selections this process begins to exhibit quadratic behavior. In short, is there a faster way? A better algorithm, perhaps?

You want to use the Walker algorithm. With N elements, there's a set-up cost of O(N). However, the sampling cost is O(1). See
A. J. Walker, An Efficient Method for Generating Discrete Random Variables and General Distributions, ACM TOMS 3, 253-256 (1977).
Knuth, TAOCP, Vol 2, Sec 3.4.1.A.
The RandomSelect class of RandomLib implements this algorithm.
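To make the O(N) set-up and O(1) sampling concrete, here is a minimal sketch of an alias table. It uses Vose's formulation of Walker's method rather than the exact procedure from the paper, and the class name AliasTable and its sample() member are hypothetical names chosen for illustration; this is not RandomLib's interface.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Illustrative alias table (Vose's variant of Walker's method).
// Build cost is O(N); each sample costs O(1).
class AliasTable {
public:
    explicit AliasTable(const std::vector<double>& weights)
        : prob_(weights.size()), alias_(weights.size()) {
        assert(!weights.empty());
        const std::size_t n = weights.size();
        double total = 0.0;
        for (double w : weights) total += w;
        assert(total > 0.0);

        // Scale the weights so that the average column height is 1.
        std::vector<double> scaled(n);
        std::vector<std::size_t> small, large;
        for (std::size_t i = 0; i < n; ++i) {
            scaled[i] = weights[i] * n / total;
            (scaled[i] < 1.0 ? small : large).push_back(i);
        }
        // Top up each under-full column with mass from an over-full one.
        while (!small.empty() && !large.empty()) {
            const std::size_t s = small.back(); small.pop_back();
            const std::size_t l = large.back(); large.pop_back();
            prob_[s] = scaled[s];
            alias_[s] = l;
            scaled[l] -= 1.0 - scaled[s];
            (scaled[l] < 1.0 ? small : large).push_back(l);
        }
        // Whatever is left is a full column (up to rounding error).
        for (std::size_t i : small) { prob_[i] = 1.0; alias_[i] = i; }
        for (std::size_t i : large) { prob_[i] = 1.0; alias_[i] = i; }
    }

    // Pick a column uniformly, then keep it or take its alias.
    template <class Rng>
    std::size_t sample(Rng& rng) const {
        std::uniform_int_distribution<std::size_t> column(0, prob_.size() - 1);
        std::uniform_real_distribution<double> coin(0.0, 1.0);
        const std::size_t c = column(rng);
        return coin(rng) < prob_[c] ? c : alias_[c];
    }

private:
    std::vector<double> prob_;        // probability of keeping column i
    std::vector<std::size_t> alias_;  // index returned otherwise
};
```

Build the table once from the weight vector and call sample() with any standard engine such as std::mt19937. If the weights change, the table has to be rebuilt, so this fits best when the weights are fixed.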

Assuming that the element weights are fixed, you can work with precomputed sums. This is like working with the cumulative probability function directly, rather than the density function.
The lookup can then be implemented as a binary search, and hence be O(log N) in the number of elements.
A binary search obviously requires random access to the container of the weights.
Alternatively, use a std::map<> and the upper_bound() method.
#include <cstdlib>
#include <iostream>
#include <map>

int main()
{
    // Key: cumulative probability up to and including the element.
    std::map<double, char> cumulative;
    cumulative[.20] = 'a';
    cumulative[.30] = 'b';
    cumulative[.40] = 'c';
    cumulative[.80] = 'd';
    cumulative[1.00] = 'e';

    const int numTests = 10;
    for (int i = 0; i != numTests; ++i)
    {
        // Divide by RAND_MAX + 1.0 so that linear lies in [0,1); with a value
        // of exactly 1.0, upper_bound() would return end().
        double linear = rand() / (RAND_MAX + 1.0);
        std::cout << linear << "\t" << cumulative.upper_bound(linear)->second << std::endl;
    }
    return 0;
}

If you have a quick enough way to sample a random element uniformly, you can use rejection sampling; all you need to know is the maximum weight. It would work as follows: Suppose the maximum weight is M. Sample an element uniformly and pick a number X uniformly in [0,1]. If the element's weight is at least M*X, choose it; otherwise repeat with a new element and a new X.
Or, an approximate solution: pick 100 elements uniformly at random; choose one proportional to weight within this set.
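A minimal sketch of the rejection-sampling idea, assuming the weights sit in a std::vector and the maximum weight is already known. The function name rejection_sample is made up for this example. Note that the random number is redrawn for every candidate; reusing a single draw across retries would skew the result toward heavy elements.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Propose an index uniformly, accept it with probability
// weights[i] / max_weight, and otherwise retry with fresh random numbers.
template <class Rng>
std::size_t rejection_sample(const std::vector<double>& weights,
                             double max_weight, Rng& rng) {
    assert(!weights.empty() && max_weight > 0.0);
    std::uniform_int_distribution<std::size_t> pick(0, weights.size() - 1);
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    for (;;) {
        const std::size_t i = pick(rng);
        // unit(rng) * max_weight plays the role of M*X in the description.
        if (unit(rng) * max_weight < weights[i]) return i;
    }
}
```

The expected number of rounds is M * N / (sum of weights), so this is fast when most weights are close to the maximum and slow when a few heavy elements dominate.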

Related

create pairs of vertices using adjacency list in linear time

I have n vertices numbered 1...n and want to pair every vertex with all other vertices. That would result in n*(n-1)/2 edges. Each vertex has some strength. The difference between the strengths of two vertices is the weight of the edge. I need to get the total weight. Using two loops I can do this in O(n^2) time, but I want to reduce the time. I can use an adjacency list and use that to create a graph of n*(n-1)/2 edges, but how will I create the adjacency list without using two loops? The input consists only of the number of vertices and the strength of each vertex.
for(int i=0;i<n;i++)
    for(int j=i+1;j<n;j++)
    {
        int w=abs((strength[i]-strength[j]));
        sum+=w;
    }
This is what I did earlier. I need a better way to do this.

If there are O(N*N) edges, then you can't list them all in linear time.
However, if indeed all you need is to compute the sum, here's a solution in O(N*log(N)). You can improve this to O(N) by using a linear-time sorting algorithm, such as radix sort, instead.
#include <algorithm>
#include <cstdint>
// ...
std::sort(strength, strength+n);
uint64_t sum = 0;
int64_t runSum = strength[0];
for(int i=1; i<n; i++) {
    sum += int64_t(i)*strength[i] - runSum;
    runSum += strength[i];
}
// Now "sum" contains the sum of weights over all edges
To explain the algorithm:
The idea is to avoid summing over all edges explicitly (requiring O(N*N)), but rather to add sums of several weights at once. Consider the last vertex n-1 and the average A[n-1] = (strength[0] + strength[1] + ... + strength[n-2])/(n-1): obviously we could add (strength[n-1] - A[n-1]) * (n-1), i.e. n-1 weights at once, if the other strengths were all smaller than strength[n-1] (or the same amount with the opposite sign if they were all larger). However, due to the abs operation, we would have to add different amounts depending on whether the strength of the other vertex is larger or smaller than the strength of the current vertex. So one solution is to sort the strengths first, to ensure that each strength is greater than or equal to the previous one.
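For checking the sorted approach against the original double loop, here is a self-contained sketch. The function names pair_sum_sorted and pair_sum_naive are made up for this example, and the strengths are assumed to fit comfortably in 64-bit integers.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdlib>
#include <vector>

// O(N log N): after sorting, element i is >= each of the i elements before
// it, so its edges to them add up to i * s[i] - (s[0] + ... + s[i-1]).
std::int64_t pair_sum_sorted(std::vector<std::int64_t> s) {
    std::sort(s.begin(), s.end());
    std::int64_t sum = 0, prefix = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        sum += static_cast<std::int64_t>(i) * s[i] - prefix;
        prefix += s[i];
    }
    return sum;
}

// O(N^2) reference: the double loop from the question.
std::int64_t pair_sum_naive(const std::vector<std::int64_t>& s) {
    std::int64_t sum = 0;
    for (std::size_t i = 0; i < s.size(); ++i)
        for (std::size_t j = i + 1; j < s.size(); ++j)
            sum += std::llabs(s[i] - s[j]);
    return sum;
}
```

For the strengths 3, 1, 4, 1, 5 both functions return 22.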

What is the Big-O of code that uses random number generators?

I want to fill the array 'a' with random values from 1 to N (no repeated values). Let's suppose that randInt(i, j) runs in O(1) and generates random values from i to j.
Examples of the output are:
{1,2,3,4,5} or {2,3,1,4,5} or {5,4,2,1,3} but not {1,2,1,3,4}
#include<set>
using std::set;
set<int> S; // space O(N) ?
int a[N]; // space O(N)
int i = 0; // space O(1)
do {
    int val = randInt(1,N); // space O(1), time O(1); is the variable val created many times?
    if (S.find(val) == S.end()) { // val not used yet; time O(log N)?
        a[i] = val; // time O(1)
        i++; // time O(1)
        S.insert(val); // time O(log N) <-- we execute this N times: O(N log N)
    }
} while(S.size() < N); // time O(1)
The while loop continues until we have generated all the values from 1 to N.
My understanding is that a set keeps its values sorted, and that both lookup and insertion take O(log N) time.
Big-O = O(1) + O(X*log N) + O(N*log N) = O(X*log N)
where X is the number of loop iterations: the larger X is, the higher the probability of having generated a number that is not yet in the set.
time O(X log N)
space O(2N+1) => O(N), since we reuse the space of val
It is very unlikely that randInt generates a different number on every call, so I expect the loop to execute at least N times.
Is the variable val created many times?
What would be a good value for X?

Suppose that the RNG is ideal. That is, repeated calls to randInt(1,N) generate an i.i.d. (independent and identically distributed) sequence of values uniformly distributed on {1,...,N}.
(Of course, in reality the RNG won't be ideal. But let's go with it since it makes the math easier.)
Average case
In the first iteration, a random value val1 is chosen which of course is not in the set S yet.
In the next iteration, another random value is chosen.
With probability (N-1)/N, it will be distinct from val1 and the inner conditional will be executed. In this case, call the chosen value val2.
Otherwise (with probability 1/N), the chosen value will be equal to val1. Retry.
How many iterations does it take on average until a valid (distinct from val1) val2 is chosen? Well, we have an independent sequence of attempts, each of which succeeds with probability (N-1)/N, and we want to know how many attempts it takes on average until the first success. This is a geometric distribution, and in general a geometric distribution with success probability p has mean 1/p. Thus, it takes N/(N-1) attempts on average to choose val2.
Similarly, it takes N/(N-2) attempts on average to choose val3 distinct from val1 and val2, and so on. Finally, the N-th value takes N/1 = N attempts on average.
In total the do loop will be executed
N/N + N/(N-1) + ... + N/1 = N * (1/1 + 1/2 + ... + 1/N)
times on average. The sum in parentheses is the N-th harmonic number, which can be roughly approximated by ln(N). (There's a well-known better approximation which is a bit more complicated and involves the Euler-Mascheroni constant, but ln(N) is good enough for finding asymptotic complexity.)
So to an approximation, the average number of iterations will be N ln N.
What about the rest of the algorithm? Things like inserting N things into a set also take at most O(N log N) time, so can be disregarded. The big remaining thing is that each iteration you have to check if the chosen random value lies in S, which takes logarithmic time in the current size of S. So we have to compute the total cost of these lookups over all iterations,
which, from numerical experiments, appears to be approximately equal to N/2 * (ln N)^2 for large N. (Consider asking for a proof of this on math.SE, perhaps.) EDIT: See this math.SE answer for a short informal proof, and the other answer to that question for a more formal proof.
So in conclusion, the total average complexity is Θ(N (ln N)^2).
Again, this is assuming that the RNG is ideal.
Worst case
Like xaxxon mentioned, it is in principle possible (though unlikely) that the algorithm will not terminate at all. Thus, the worst case complexity would be O(∞).

That's a very bad algorithm for achieving your goal.
Simply fill the array with the numbers 1 through N and then shuffle.
That's O(N)
https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle
To shuffle, pick an index between 0 and N-1 and swap it with index 0. Then pick an index between 1 and N-1 and swap it with index 1. All the way until the end of the list.
In terms of your specific question, it depends on the behavior of your random number generator. If it's truly random, it may never complete. If it's pseudorandom, it depends on the period of the generator. If it has a period of 5, then you'll never have any dupes.
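The shuffle described above can be sketched as follows. The function name random_permutation is made up for this example, and a standard-library engine stands in for the question's randInt.

```cpp
#include <cassert>
#include <random>
#include <utility>
#include <vector>

// Returns a uniformly random permutation of 1..n in O(n) time.
template <class Rng>
std::vector<int> random_permutation(int n, Rng& rng) {
    assert(n >= 0);
    std::vector<int> a(n);
    for (int i = 0; i < n; ++i) a[i] = i + 1;   // fill with 1..n
    for (int i = 0; i + 1 < n; ++i) {
        // Pick an index in [i, n-1] and swap it into position i.
        std::uniform_int_distribution<int> pick(i, n - 1);
        std::swap(a[i], a[pick(rng)]);
    }
    return a;
}
```

In real code std::shuffle from <algorithm> does the shuffling step for you; the loop is spelled out here only to show why the whole thing is linear.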

It's catastrophically bad code with complex behaviour. Generating the first number is O(1). The second involves a binary search, so log N, plus a rerun of the generator should the number be found. The chance of getting a new number after i numbers have been placed is p = 1 - i/N. So the average number of re-runs is the reciprocal, which gives you another factor of up to N. So O(N^2 log N).
The way to do it is to generate the numbers, then shuffle them. That's O(N).

Pick a matrix cell according to its probability

I have a 2D matrix of non-negative real values, stored as follows:
vector<vector<double>> matrix;
Each cell can have a value greater than or equal to 0, and this value represents the relative probability of the cell being chosen. For example, a cell with a value of 3 is three times as likely to be chosen as a cell with a value of 1.
I need to select N cells of the matrix (0 <= N <= total number of cells) randomly, but according to their probability of being selected.
How can I do that?
The algorithm should be as fast as possible.

I describe two methods, A and B.
A works in time approximately N * number of cells, and uses space O(log number of cells). It is good when N is small.
B works in time approximately (number of cells + N) * O(log number of cells), and uses space O(number of cells). So it is good when N is large (or even 'medium'), but it uses a lot more memory, so in practice it might be slower in some regimes for that reason.
Method A:
The first thing you need to do is normalize the entries. (It's not clear to me if you assume they are normalized or not.) That means, sum all the entries and divide by the sum. (This part is potentially slow, so it's better if you assume or require that it already happened.)
Then you sample like this:
1. Choose a random [i,j] entry of the matrix (by choosing i,j each uniformly randomly from the range of integers 0 to n-1).
2. Choose a uniformly random real number p in the range [0, 1].
3. Check if matrix[i][j] > p. If so, return the pair [i,j]. If not, go back to step 1.
Why does this work? The probability that we end at step 3 with any particular output is equal to the probability that [i,j] was selected (this is the same for each entry) times the probability that the number p was small enough. This is proportional to the value matrix[i][j], so the sampling chooses each entry with the correct proportions. It's also possible that at step 3 we go back to the start -- does that bias things? Basically, no. The reason is this: suppose we arbitrarily choose a number k and then consider the distribution of the algorithm's output, conditioned on stopping exactly after k rounds. Whatever value of k we choose, that conditional distribution has to be exactly right by the above argument, since once we eliminate the case that p is too large, the other possibilities all have their proportions correct. Since the distribution is perfect for each value of k that we might condition on, and the overall distribution (not conditioned on k) is an average of the distributions for each value of k, the overall distribution is perfect also.
If you want to analyze the number of rounds that are typically needed in a rigorous way, you can do it by analyzing the probability that we actually stop at step 3 in any particular round. Since the rounds are independent, this is the same for every round, and statistically it means that the number of rounds is geometrically distributed. Its mean is the reciprocal of that probability, and the chance of running much longer than the mean falls off exponentially.
The probability that we stop at step 3 can be determined by considering the conditional probability that we stop at step 3, given that we chose any particular entry [i][j]. By the formulas for conditional expectation, you get that
Pr[ stop at step 3 ] = sum_{i,j} ( 1/(n^2) * Matrix[i,j] )
Since we assumed the matrix is normalized, this sum reduces to just 1/n^2. So, the expected number of rounds is about n^2 (that is, n^2 up to a constant factor) no matter what the entries in the matrix are. You can't hope to do a lot better than that I think -- that's about the same amount of time it takes to just read all the entries of the matrix, and it's hard to sample from a distribution that you cannot even read all of.
Note: What I described is a way to correctly sample a single element -- to get N elements from one matrix, you can just repeat it N times.
Method B:
Basically you just want to compute a histogram and sample inversely from it, so that you know you get exactly the right distribution. Computing the histogram is expensive, but once you have it, getting samples is cheap and easy.
In C++ it might look like this:
// Make histogram
typedef unsigned int uint;
typedef std::pair<uint, uint> upair;
typedef std::map<double, upair> histogram_type;
histogram_type histogram;
double cumulative = 0.0;
for (uint i = 0; i < Matrix.size(); ++i) {
    for (uint j = 0; j < Matrix[i].size(); ++j) {
        if (Matrix[i][j] <= 0.0) continue; // a zero-weight cell would overwrite the previous cell's key
        cumulative += Matrix[i][j];
        histogram[cumulative] = std::make_pair(i,j);
    }
}
std::vector<upair> result;
for (uint k = 0; k < N; ++k) {
    // Do a sample (this should never repeat... if it does not find a lower bound you could also assert false quite reasonably since it means something is wrong with the rand() implementation)
    while(1) {
        // p is uniform in [0, cumulative). Or, for best results use std::mt19937 or boost::mt19937 and sample a real in the range [0,1) here.
        double p = cumulative * (rand() / (RAND_MAX + 1.0));
        histogram_type::iterator it = histogram.lower_bound(p);
        if (it != histogram.end()) {
            result.push_back(it->second);
            break;
        }
    }
}
return result;
Here the time to make the histogram is something like number of cells * O(log number of cells) since inserting into the map takes time O(log n). You need an ordered data structure in order to get cheap lookup N * O(log number of cells) later when you do repeated sampling. Possibly you could choose a more specialized data structure to go faster, but I think there's only limited room for improvement.
Edit: As @Bob__ points out in the comments, method (B) as written there is potentially going to have some error due to floating point round-off if the matrices are quite large, even using type double, at this line:
cumulative += Matrix[i][j];
The problem is that if cumulative is much larger than Matrix[i][j], beyond what the floating point precision can handle, then each time this statement is executed part of the addend may be lost, and these errors accumulate to introduce significant inaccuracy.
As he suggests, if that happens, the most straightforward way to fix it is to sort the values Matrix[i][j] first. You could even do this in the general implementation to be safe -- sorting these guys isn't going to take more time asymptotically than you already have anyways.
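As a point of comparison with the hand-rolled histogram, the standard library's std::discrete_distribution does the cumulative bookkeeping itself. The following is only a sketch under the same assumptions as above (sampling with replacement, at least one positive entry); the function name sample_cells is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

typedef std::pair<std::size_t, std::size_t> cell;

// Draws n cells (with replacement), each with probability proportional to
// its value. Set-up is linear in the number of cells.
template <class Rng>
std::vector<cell> sample_cells(const std::vector<std::vector<double> >& matrix,
                               std::size_t n, Rng& rng) {
    std::vector<double> flat;   // weights in row-major order
    std::vector<cell> where;    // matrix position of each flat weight
    for (std::size_t i = 0; i < matrix.size(); ++i)
        for (std::size_t j = 0; j < matrix[i].size(); ++j) {
            flat.push_back(matrix[i][j]);
            where.push_back(cell(i, j));
        }
    assert(!flat.empty());
    std::discrete_distribution<std::size_t> dist(flat.begin(), flat.end());
    std::vector<cell> result;
    result.reserve(n);
    for (std::size_t k = 0; k < n; ++k) result.push_back(where[dist(rng)]);
    return result;
}
```

The entries do not need to be normalized first; the distribution normalizes the weights internally.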

Find pair of elements in integer array such that abs(v[i]-v[j]) is minimized

Let's say we have an int array with 5 elements: 1, 2, 3, 4, 5
What I need to do is find the minimum absolute value of the differences between the array's elements.
We need to check them like this:
1-2 2-3 3-4 4-5
1-3 2-4 3-5
1-4 2-5
1-5
And find the minimum absolute value of these subtractions. We can find it with 2 for loops. The question is, is there any algorithm for finding the value with only one for loop?

Sort the list and subtract each pair of adjacent elements; the minimum is among those differences.
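A minimal sketch of that idea, using std::sort and a single pass over neighbours. The function name min_abs_difference is made up for this example, and the result is widened to long long so that the subtraction cannot overflow.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Smallest |v[i] - v[j]| over all pairs i != j, in O(n log n).
long long min_abs_difference(std::vector<int> v) {
    assert(v.size() >= 2);
    std::sort(v.begin(), v.end());
    // After sorting, the closest pair must be adjacent.
    long long best = static_cast<long long>(v[1]) - v[0];
    for (std::size_t i = 2; i < v.size(); ++i)
        best = std::min(best, static_cast<long long>(v[i]) - v[i - 1]);
    return best;
}
```

For 1, 2, 3, 4, 5 this returns 1. There is one explicit loop here, although the sort of course does its own looping internally.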

The provably best performing solution is asymptotically linear, O(n), up to constant factors.
This means that the time taken is proportional to the number of the elements in the array (which of course is the best we can do as we at least have to read every element of the array, which already takes O(n) time).
Here is one such O(n) solution (which also uses O(1) space if the list can be modified in-place):
int mindiff(vector<int>& v) // sorts v in place, so it cannot be const
{
    IntRadixSort(v.begin(), v.end());
    int best = INT_MAX; // from <climits>
    for (size_t i = 0; i + 1 < v.size(); i++)
    {
        int diff = abs(v[i]-v[i+1]);
        if (diff < best)
            best = diff;
    }
    return best;
}
IntRadixSort is a linear time fixed-width integer sorting algorithm defined here:
http://en.wikipedia.org/wiki/Radix_sort
The concept is that you leverage the fixed-bitwidth nature of ints by partitioning them in a fixed number of passes over the bit positions, i.e. partition them on the high bit (32nd), then on the next highest (31st), then on the next (30th), and so on. Each pass is linear and the number of passes is constant, so the whole sort takes linear time.

The problem is equivalent to sorting. Any sorting algorithm could be used, and at the end, return the difference between the nearest elements. A final pass over the data could be used to find that difference, or it could be maintained during the sort. Before the data is sorted the min difference between adjacent elements will be an upper bound.
So to do it without two loops, use a sorting algorithm that does not have two loops. In a way it feels like semantics, but recursive sorting algorithms will do it with only one loop. If the issue is the n(n-1)/2 subtractions required by the simple two-loop case, you can use an O(n log n) algorithm.

No, unless you know the list is sorted, you need two loops.

It's simple. Iterate in a single for loop and keep four variables: "minpos" and "maxpos", and "minneg" and "maxneg". Check the sign of each value you encounter; store the maximum positive number in maxpos and the minimum positive number in minpos, and do the same for numbers less than zero. Now take the difference maxpos-minpos in one variable and maxneg-minneg in another, and print the larger of the two. You will get the desired result.
I believe you already know how to find the max and min in one for loop.
Correction: the above finds the maximum difference. For the minimum you need to take the max and the second max instead of the max and min :)

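This might help you:
int n = 5;                    // number of elements in a
int subtractmin = INT_MAX;    // from <climits>
int m = 0;                    // index of the first element of the pair
for (int i = 1; m < n - 1; i++) {
    // a[m] and a[m+i] form the current pair
    if (abs(a[m] - a[m+i]) < subtractmin)
        subtractmin = abs(a[m] - a[m+i]);
    if (m + i == n - 1) {     // all pairs starting at m are done
        m = m + 1;
        i = 0;                // the loop increment makes this 1
    }
}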

What are practical uses for STL's 'partial_sum'?

What/where are the practical uses of the partial_sum algorithm in STL?
What are some other interesting/non-trivial examples or use-cases?

I used it to reduce memory usage of a simple mark-sweep garbage collector in my toy lambda calculus interpreter.
The GC pool is an array of objects of identical size. The goal is to eliminate objects that aren't linked to other objects, and condense the remaining objects into the beginning of the array. Since the objects are moved in memory, each link needs to be updated. This necessitates an object remapping table.
partial_sum allows the table to be stored in compressed format (as little as one bit per object) until the sweep is complete and memory has been freed. Since the objects are small, this significantly reduces memory use.
Recursively mark used objects and populate the Boolean array.
Use remove_if to condense the marked objects to the beginning of the pool.
Use partial_sum over the Boolean values to generate a table of pointers/indexes into the new pool.
This works because the Nth marked object has N preceding 1's in the array and acquires pool index N.
Sweep over the pool again and replace each link using the remap table.
It's especially friendly to the data cache to put the remap table in the just-freed, thus still hot, memory.

One thing to note about partial sum is that it is the operation that undoes adjacent difference much like - undoes +. Or better yet if you remember calculus the way differentiation undoes integration. Better because adjacent difference is essentially differentiation and partial sum is integration.
Let's say you have simulation of a car and at each time step you need to know the position, velocity, and acceleration. You only need to store one of those values as you can compute the other two. Say you store the position at each time step you can take the adjacent difference of the position to give the velocity and the adjacent difference of the velocity to give the acceleration. Alternatively, if you store the acceleration you can take the partial sum to give the velocity and the partial sum of the velocity gives the position.
Partial sum is one of those functions that doesn't come up too often for most people but is enormously useful when you find the right situation. A lot like calculus.
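To see the two operations undo each other, here is a small sketch with integer positions. The wrapper names differences and running_sums are made up for this example; the standard algorithms underneath are std::adjacent_difference and std::partial_sum from <numeric>.

```cpp
#include <numeric>
#include <vector>

// Per-step change between consecutive samples. The first output element is
// the first input element itself, which is what makes the round trip exact.
std::vector<int> differences(const std::vector<int>& samples) {
    std::vector<int> out(samples.size());
    std::adjacent_difference(samples.begin(), samples.end(), out.begin());
    return out;
}

// Running total of the steps; the inverse of differences().
std::vector<int> running_sums(const std::vector<int>& steps) {
    std::vector<int> out(steps.size());
    std::partial_sum(steps.begin(), steps.end(), out.begin());
    return out;
}
```

With positions 0, 1, 4, 9, 16 the differences are 0, 1, 3, 5, 7, and running_sums of those gives the positions back.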

Last time I (would have) used it was when converting a discrete probability distribution (an array of p(X = k)) into a cumulative distribution (an array of p(X <= k)). To select once from the distribution, you can pick a number from [0,1) randomly, then binary search into the cumulative distribution.
That code wasn't in C++, though, so I did the partial sum myself.

You can use it to generate a monotonically increasing sequence of numbers. For example, the following generates a vector containing the numbers 1 through 42:
std::vector<int> v(42, 1);
std::partial_sum(v.begin(), v.end(), v.begin());
Is this an everyday use case? Probably not, though I've found it useful on several occasions.
You can also use std::partial_sum to generate a list of factorials. (This is even less useful, though, since the number of factorials that can be represented by a typical integer data type is quite limited. It is fun, though :-D)
std::vector<int> v(10, 1);
std::partial_sum(v.begin(), v.end(), v.begin());
std::partial_sum(v.begin(), v.end(), v.begin(), std::multiplies<int>());

Personal Use Case: Roulette-Wheel-Selection
I'm using partial_sum in a roulette-wheel-selection algorithm. This algorithm randomly chooses elements from a container with a probability that is proportional to some value given beforehand.
Because the elements I choose from carry values that are not necessarily normalized, I use the partial_sum algorithm to construct something like a "roulette wheel" by summing up all the elements. Then I choose a random value in this range (the last partial sum is the sum of all values) and use std::lower_bound to search "the wheel" for where my random value landed. The element returned by the lower_bound algorithm is the chosen one.
Besides the advantage of clear and expressive code with the use of partial_sum, I could also gain some speed when experimenting with the GCC parallel mode, which brings parallelized versions of some algorithms, and one of them is partial_sum.
Another use I know of: one of the most important algorithmic primitives in parallel processing (but maybe a little bit away from the STL).
If you're interested in heavily optimized algorithms which use partial_sum (in this case you may find more results under the synonyms "scan" or "prefix_sum"), then go to the parallel algorithms community. They need it all the time. You won't find a parallel sorting algorithm based on quicksort or mergesort without it. This operation is one of the most important parallel primitives used. I think it is most commonly used for calculating offsets in dynamic algorithms. Think of a partition step in quicksort, which is split and fed to the parallel threads. You don't know the number of elements in each slot of the partition before calculating it. So you need some offsets for all the threads for later access.
Maybe you will find more information in the now-hot topic of GPU processing. One short article regarding Nvidia's CUDA and the scan primitive, with a few application examples, is Chapter 39, Parallel Prefix Sum (Scan) with CUDA.

Personal Use Case: intermediate step in counting sort from CLRS:
COUNTING_SORT (A, B, k)
for i ← 1 to k do
    c[i] ← 0
for j ← 1 to n do
    c[A[j]] ← c[A[j]] + 1
// c[i] now contains the number of elements equal to i
// std::partial_sum here
for i ← 2 to k do
    c[i] ← c[i] + c[i-1]
// c[i] now contains the number of elements ≤ i
for j ← n downto 1 do
    B[c[A[j]]] ← A[j]
    c[A[j]] ← c[A[j]] - 1
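The pseudocode translates fairly directly to C++ with std::partial_sum taking the place of the middle loop. This is a sketch that uses 0-based indices and assumes the values lie in 0..k; the function name counting_sort is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Stable counting sort for values in the range 0..k.
std::vector<int> counting_sort(const std::vector<int>& a, int k) {
    std::vector<std::size_t> c(k + 1, 0);
    for (int x : a) {
        assert(x >= 0 && x <= k);
        ++c[x];                       // c[x] = number of elements equal to x
    }
    // The prefix-sum step: c[x] becomes the number of elements <= x.
    std::partial_sum(c.begin(), c.end(), c.begin());
    std::vector<int> b(a.size());
    // Walk backwards so that equal elements keep their relative order.
    for (std::size_t j = a.size(); j-- > 0;)
        b[--c[a[j]]] = a[j];
    return b;
}
```

For example, counting_sort({3, 0, 2, 3, 1, 0}, 3) gives 0, 0, 1, 2, 3, 3.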

You could build a "moving sum" (precursor to a moving average):
#include <numeric>
#include <vector>
using namespace std;

template <class T>
void moving_sum (const vector<T>& in, int num, vector<T>& out)
{
    // cumulative sum (out must already have in.size() elements)
    partial_sum (in.begin(), in.end(), out.begin());
    // shift and subtract
    for (int i = out.size() - 1; i >= 0; i--) {
        int j = i - num;
        if (j >= 0)
            out[i] -= out[j];
    }
}
And then call it with:
vector<double> v(10);
// fill in v
vector<double> v2 (v.size());
moving_sum (v, 3, v2);

You know, I actually did use partial_sum() once... It was this interesting little problem that I was asked on a job interview. I enjoyed it so much, I went home and coded it up.
The problem was: Given a sequence of integers, find the shortest contiguous subsequence with the highest sum. E.g. Given:
Value: -1 2 3 -1 4 -2 -4 5
Index: 0 1 2 3 4 5 6 7
We would find the subsequence [1,4]
Now the obvious solution is to run with 3 for loops, iterating over all possible starts & ends, and adding up the value of each possible subsequence in turn. Inefficient, but quick to code up and hard to make mistakes. (Especially when the third for loop is just accumulate(start,end,0).)
The correct solution involves a divide-and-conquer / bottom up approach. E.g. Divide the problem space in half, and for each half compute the largest subsequence contained within that section, the largest subsequence including the starting number, the largest subsequence including the ending number, and the entire section's subsequence. Armed with this data we can then combine the two halves together without any further evaluation of either one. Obviously the data for each half can be computed by further breaking each half into halves (quarters), each quarter into halves (eighths), and so on until we have trivial singleton cases. It's all quite efficient.
But all that aside, there's a third (somewhat less efficient) option that I wanted to explore. It's similar to the 3-for-loop case, only we add the adjacent numbers to avoid so much work. The idea is that there's no need to add a+b, a+b+c, and a+b+c+d when we can add t1=a+b, t2=t1+c, and t3=t2+d. It's a space/computation tradeoff thing. It works by transforming the sequence:
Index: 0 1 2 3 4
FROM: 1 2 3 4 5
TO: 1 3 6 10 15
Thereby giving us the sums of all possible subsequences starting at index=0 and ending at indexes=0,1,2,3,4.
Then we iterate over this set subtracting the successive possible "start" points...
FROM: 1 3 6 10 15
TO: - 2 5 9 14
TO: - - 3 7 12
TO: - - - 4 9
TO: - - - - 5
Thereby giving us the values (sums) of all possible subsequences.
We can find the maximum value of each iteration via max_element().
The first step is most easily accomplished via partial_sum().
The remaining steps via a for loop and transform(data+i,data+size,data+i,bind2nd(minus<TYPE>(),data[i-1])).
Clearly O(N^2). But still interesting and fun...
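Here is a rough sketch of that O(N^2) idea. Instead of transforming the array in place on every pass, it keeps the prefix sums and subtracts the sum that precedes each start point, which yields the same values as the rows in the table. The function name max_run_sum is made up for this example, and it returns only the best sum, not the indices.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Largest sum of any contiguous run, in O(N^2) time and O(N) extra space.
long long max_run_sum(const std::vector<long long>& values) {
    assert(!values.empty());
    // sums[e] = values[0] + ... + values[e]: all runs starting at index 0.
    std::vector<long long> sums(values.size());
    std::partial_sum(values.begin(), values.end(), sums.begin());
    long long best = *std::max_element(sums.begin(), sums.end());
    // Runs starting at `start` have sum sums[end] - sums[start - 1].
    for (std::size_t start = 1; start < sums.size(); ++start) {
        const long long before = sums[start - 1];
        for (std::size_t end = start; end < sums.size(); ++end)
            best = std::max(best, sums[end] - before);
    }
    return best;
}
```

For the example sequence -1 2 3 -1 4 -2 -4 5 this returns 8, the sum of the run at indices [1,4].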

Partial sums are often useful in parallel algorithms. Consider the code
for (int i=0; N>i; ++i) {
sum += x[i];
do_something(sum);
}
If you want to parallelise this code, you need to know the partial sums. I am using GNU's parallel version of partial_sum for something very similar.

I often use partial sum not to sum but to calculate the current value in the sequence depending on the previous.
For example, when you integrate a function, each new step builds on the previous step: vt += dvdt or vt = integrate_step(dvdt, t_prev, t_prev+dt);.

In nonparametric Bayesian methods there is a Metropolis-Hastings step (per observation) that determines whether to sample a new or an existing cluster. If an existing cluster has to be sampled, this needs to be done with different weights. These weighted likelihoods are simulated in the following example code.
#include <algorithm>
#include <iostream>
#include <iterator>
#include <numeric>
#include <random>
#include <vector>

int main() {
    std::default_random_engine generator(std::random_device{}());
    std::uniform_real_distribution<double> distribution(0.0,1.0);
    int K = 8;
    std::vector<double> weighted_likelihood(K);
    for (int i = 0; i < K; ++i) {
        weighted_likelihood[i] = i*10;
    }
    std::cout << "Weighted likelihood: ";
    for (auto i: weighted_likelihood) std::cout << i << ' ';
    std::cout << std::endl;

    // Running totals of the weights: the "cumulative distribution".
    std::vector<double> cumsum_likelihood(K);
    std::partial_sum(weighted_likelihood.begin(), weighted_likelihood.end(), cumsum_likelihood.begin());
    std::cout << "Cumulative sum of weighted likelihood: ";
    for (auto i: cumsum_likelihood) std::cout << i << ' ';
    std::cout << std::endl;

    std::vector<int> frequency(K);
    int N = 280000;
    for (int i = 0; i < N; ++i) {
        // Scale a uniform draw to the total weight and binary search for it.
        double pick = distribution(generator) * cumsum_likelihood.back();
        auto lower = std::lower_bound(cumsum_likelihood.begin(), cumsum_likelihood.end(), pick);
        int index = std::distance(cumsum_likelihood.begin(), lower);
        frequency[index]++;
    }
    std::cout << "Frequencies: ";
    for (auto i: frequency) std::cout << i << ' ';
    std::cout << std::endl;
}
Note that this is not different from the answer by https://stackoverflow.com/users/13005/steve-jessop. It's added to give a bit more context about a particular situation (nonparametric Bayesian methods, e.g. the algorithms by Neal using the Dirichlet process as prior) and the actual code which uses partial_sum in combination with lower_bound.
