modify the squared sum [ceres-solver] - c++

I am trying to modify the default behavior of Ceres, which is to use the squared sum of residuals as the cost function. I want it to compute only a sum (the residuals are already computed in a manner that guarantees they are positive).
According to the documentation I should use ConditionedCostFunction.
This is what I have done:
I define a conditioner that takes 1 residual and 1 parameter:
struct Conditioners : ceres::CostFunction
{
public:
    Conditioners()
    {
        set_num_residuals(1);
        mutable_parameter_block_sizes()->push_back(1);
    }
    ~Conditioners()
    {}
    template<typename T>
    T operator() (T x)
    {
        return T(x * x);
    }
    bool Evaluate(double const* const* parameters, double* residuals, double** jacobians) const
    {
        return true;
    }
};
I put the conditioners inside a vector:
std::vector<ceres::CostFunction*> conditioners;
for (int i = 0; i < 1; i++)
    conditioners.push_back(new Conditioners());
ceres::ConditionedCostFunction* ccf =
    new ceres::ConditionedCostFunction(cost_function, conditioners, ceres::TAKE_OWNERSHIP);
problem.AddResidualBlock(ccf, NULL, &x);
It compiles and everything, but it does not solve the problem. It does not even start. It says:
Ceres Solver Report: Iterations: 0, Initial cost: 4.512500e+01, Final cost: 4.512500e+01, Termination: CONVERGENCE
x : 0.5 -> 0.5
instead of:
iter cost cost_change |gradient| |step| tr_ratio tr_radius ls_iter iter_time total_time
0 4.512500e+01 0.00e+00 9.50e+00 0.00e+00 0.00e+00 1.00e+04 0 2.99e-04 1.04e-03
1 4.511598e-07 4.51e+01 9.50e-04 9.50e+00 1.00e+00 3.00e+04 1 3.84e-04 9.72e-03
2 5.012552e-16 4.51e-07 3.17e-08 9.50e-04 1.00e+00 9.00e+04 1 2.98e-05 9.92e-03
Ceres Solver Report: Iterations: 2, Initial cost: 4.512500e+01, Final cost: 5.012552e-16, Termination: CONVERGENCE
x : 0.5 -> 10
(If you want to try it yourself, this example modifies the helloworld example.)
Do you have any direction on what went wrong? (The Ceres report was not more specific.)

I found the solution, which is :
struct Conditioners : ceres::CostFunction
{
public:
    Conditioners()
    {
        set_num_residuals(1);
        mutable_parameter_block_sizes()->push_back(1);
    }
    ~Conditioners()
    {}
    template<typename T>
    T operator() (T x)
    {
        return T(x * x);
    }
    bool Evaluate(double const* const* parameters, double* residuals, double** jacobians) const
    {
        residuals[0] = parameters[0][0] * parameters[0][0];
        if (jacobians)
            jacobians[0][0] = 2.0 * parameters[0][0];
        return true;
    }
};
My error was to think that I only had to re-implement the () operator and that Ceres would figure out the Jacobian automatically. It does not.
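As an aside (not part of the original fix): another option, if one would rather not hand-code the Jacobian, might be to wrap a templated functor in ceres::AutoDiffCostFunction and hand that to ConditionedCostFunction, since the conditioners are ordinary cost functions. A minimal, untested sketch:
// Hypothetical functor name; AutoDiff evaluates operator() with
// T = double and T = ceres::Jet, deriving the Jacobian automatically.
struct SquareConditioner
{
    template <typename T>
    bool operator()(const T* const x, T* residual) const
    {
        residual[0] = x[0] * x[0];
        return true;
    }
};

// 1 residual, one parameter block of size 1, matching the conditioner above.
std::vector<ceres::CostFunction*> conditioners;
conditioners.push_back(
    new ceres::AutoDiffCostFunction<SquareConditioner, 1, 1>(
        new SquareConditioner()));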

Related

Drake: Integrate Mass Matrix and Bias Term in Optimization Problem

I am trying to implement nonlinear MPC for a 7-DOF manipulator in Drake. To do this, in my constraints, I need dynamic quantities like the mass matrix M(q) and the bias term C(q, q_dot)*q_dot, but those depend on the decision variables q, q_dot.
I tried the following:
// finalize plant
// create builder, diagram, context, plant context
...
// formulate optimization problem
drake::solvers::MathematicalProgram prog;
// create decision variables
...
std::vector<drake::solvers::VectorXDecisionVariable> q_v;
std::vector<drake::solvers::VectorXDecisionVariable> q_ddot;
for (int i = 0; i < H; i++) {
    q_v.push_back(prog.NewContinuousVariables<14>(state_var_name));
    q_ddot.push_back(prog.NewContinuousVariables<7>(input_var_name));
}
// add cost
...
// add constraints
...
for (int i = 0; i < H; i++) {
    plant.SetPositionsAndVelocities(*plant_context, q_v[i]);
    plant.CalcMassMatrix(*plant_context, M);
    plant.CalcBiasTerm(*plant_context, C_q_dot);
}
...
for (int i = 0; i < H; i++) {
    prog.AddConstraint(M * q_ddot[i] + C_q_dot + G >= lb);
    prog.AddConstraint(M * q_ddot[i] + C_q_dot + G <= ub);
}
// solve prog
...
The above code will not work, because plant.SetPositionsAndVelocities(.) doesn't accept symbolic variables.
Is there any way to integrate M, C in my OCP constraints?
I think you want to impose the following nonlinear, non-convex constraint:
lb <= M * qddot + C(q, v) + g(q) <= ub
Since this constraint is non-convex, we will need to solve it through nonlinear optimization and evaluate the constraint in every iteration of the solver. We can't do this evaluation using symbolic computation (it would be horribly slow).
So you will need a constraint evaluator, something like this:
// This constraint takes [q; v; vdot] and evaluates
// M * vdot + C(q, v) + g(q)
class MyConstraint : public solvers::Constraint {
public:
    MyConstraint(const MultibodyPlant<AutoDiffXd>& plant,
                 systems::Context<AutoDiffXd>* context,
                 const Eigen::Ref<const Eigen::VectorXd>& lb,
                 const Eigen::Ref<const Eigen::VectorXd>& ub)
        : solvers::Constraint(plant.num_velocities(),
                              plant.num_positions() + 2 * plant.num_velocities(),
                              lb, ub),
          plant_{plant}, context_{context} {
        ...
    }
private:
    void DoEval(const Eigen::Ref<const AutoDiffVecXd>& x, AutoDiffVecXd* y) const {
        ...
    }
    const MultibodyPlant<AutoDiffXd>& plant_;
    systems::Context<AutoDiffXd>* context_;
};
int main() {
    ...
    // Construct the constraint and add it at every time instance
    std::vector<std::unique_ptr<systems::Context<AutoDiffXd>>> plant_contexts;
    for (int i = 0; i < H; ++i) {
        plant_contexts.push_back(plant.CreateDefaultContext());
        prog.AddConstraint(std::make_shared<MyConstraint>(plant, plant_contexts[i].get(), lb, ub),
                           {q_v[i], q_ddot[i]});
    }
}
You could refer to the class CentroidalMomentumConstraint on how to construct your own MyConstraint class.
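For illustration only (my sketch, not part of the original answer): the DoEval body might look roughly like the following, assuming x is stacked as [q; v; vdot] and using the MultibodyPlant methods SetPositionsAndVelocities, CalcMassMatrixViaInverseDynamics, CalcBiasTerm, and CalcGravityGeneralizedForces. A solvers::Constraint subclass must also provide the double and symbolic DoEval overloads, which are elided here.
void DoEval(const Eigen::Ref<const AutoDiffVecXd>& x, AutoDiffVecXd* y) const {
    const int nq = plant_.num_positions();
    const int nv = plant_.num_velocities();
    // The first nq + nv entries of x are [q; v]; the last nv entries are vdot.
    plant_.SetPositionsAndVelocities(context_, x.head(nq + nv));
    MatrixX<AutoDiffXd> M(nv, nv);
    plant_.CalcMassMatrixViaInverseDynamics(*context_, &M);
    VectorX<AutoDiffXd> C(nv);
    plant_.CalcBiasTerm(*context_, &C);
    // CalcGravityGeneralizedForces returns tau_g(q), so the gravity
    // term g(q) enters with a minus sign here.
    *y = M * x.tail(nv) + C - plant_.CalcGravityGeneralizedForces(*context_);
}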

pass parameters of double but get Jet<double,6> when using ceres solver

I'm new to Ceres Solver. I am adding a residual block using
problem.AddResidualBlock( new ceres::AutoDiffCostFunction<Opt, 1, 6> (new Opt(Pts[i][j].x, Pts[i][j].y, Pts[i][j].z, Ns[i].at<double>(0, 0), Ns[i].at<double>(1, 0), Ns[i].at<double>(2, 0), Ds[i], weights[i]) ),
NULL,
param );
where param is a double[6]:
struct Opt
{
    const double ptX, ptY, ptZ, nsX, nsY, nsZ, ds, w;
    Opt(double ptx, double pty, double ptz, double nsx, double nsy, double nsz, double ds1, double w1):
        ptX(ptx), ptY(pty), ptZ(ptz), nsX(nsx), nsY(nsy), nsZ(nsz), ds(ds1), w(w1) {}
    template<typename T>
    bool operator()(const T* const x, T* residual) const
    {
        Mat R(3, 3, CV_64F), r(1, 3, CV_64F);
        Mat inverse(3, 3, CV_64F);
        T newP[3];
        T xyz[3];
        for (int i = 0; i < 3; i++){
            r.at<T>(i) = T(x[i]);
            cout << x[i] << endl;
        }
        Rodrigues(r, R);
        inverse = R.inv();
        newP[0] = T(ptX) - x[3];
        newP[1] = T(ptY) - x[4];
        newP[2] = T(ptZ) - x[5];
        xyz[0] = inverse.at<T>(0, 0)*newP[0] + inverse.at<T>(0, 1)*newP[1] + inverse.at<T>(0, 2)*newP[2];
        xyz[1] = inverse.at<T>(1, 0)*newP[0] + inverse.at<T>(1, 1)*newP[1] + inverse.at<T>(1, 2)*newP[2];
        xyz[2] = inverse.at<T>(2, 0)*newP[0] + inverse.at<T>(2, 1)*newP[1] + inverse.at<T>(2, 2)*newP[2];
        T ds1 = T(nsX) * xyz[0] + T(nsY) * xyz[1] + T(nsZ) * xyz[2];
        residual[0] = (ds1 - T(ds)) * T(w);
        return true;
    }
};
but when I output x[0], I get this:
[-1.40926 ; 1, 0, 0, 0, 0, 0]
After I change the type of x to double,
I get this error:
note: no known conversion for argument 1 from ‘const ceres::Jet<double, 6>* const’ to ‘const double*’
in
bool operator()(const double* const x, double* residual) const
What's wrong with my code?
Thanks a lot!
I am guessing you are using cv::Mat.
The reason the functor is templated is that Ceres evaluates it using doubles when it needs just the residuals, and evaluates it with ceres::Jet objects when it needs to compute the Jacobian. So your attempt to fill r as
for (int i = 0; i < 3; i++){
r.at<T>(i) = T(x[i]);
cout<<x[i]<<endl;
}
is trying to convert a Jet into a double, which is what the compiler is correctly complaining about.
You can re-write your code as follows (I have not compiled it, so there may be a minor typo or two):
template<typename T>
bool operator()(const T* const x, T* residual) const {
    // Inverting an angle-axis rotation just negates the vector.
    const T inverse_rotation[3] = {-x[0], -x[1], -x[2]};
    const T newP[3] = {ptX - x[3], ptY - x[4], ptZ - x[5]};
    T xyz[3];
    ceres::AngleAxisRotatePoint(inverse_rotation, newP, xyz);
    const T ds1 = nsX * xyz[0] + nsY * xyz[1] + nsZ * xyz[2];
    residual[0] = (ds1 - ds) * w;
    return true;
}
Automatic differentiation (AutoDiff) needs a templated cost function to keep track of the operations.
Please take a look at the Ceres documentation (http://ceres-solver.org/nnls_modeling.html#autodiffcostfunction). There are a lot of nice examples too. I used them as the starting point for my first Ceres experiments.
I'm not sure whether you can use Ceres cost functions with OpenCV functions. In most cases Eigen is used to build the cost function.
Ceres comes with a lot of "ready-to-use" components for cost functions like yours.
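For example, the AngleAxisRotatePoint() helper used in the rewritten functor above is one of those components; it lives in ceres/rotation.h. A minimal sketch of the wiring (assuming the Opt functor and variables from the question):
#include <ceres/ceres.h>
#include <ceres/rotation.h> // ceres::AngleAxisRotatePoint and other rotation helpers

// 1 residual, one parameter block of size 6 (angle-axis rotation + translation):
problem.AddResidualBlock(
    new ceres::AutoDiffCostFunction<Opt, 1, 6>(
        new Opt(ptx, pty, ptz, nsx, nsy, nsz, ds1, w1)),
    nullptr, param);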

unexpected results with word2vec algorithm

I implemented word2vec in C++.
I found the original syntax unclear, so I figured I'd re-implement it, using all the benefits of C++ (std::map, std::vector, etc.).
This is the method that actually gets called every time a sample is trained (l1 denotes the index of the first word, l2 the index of the second word, label indicates whether it is a positive or negative sample, and neu1e acts as the accumulator for the gradient):
void train(int l1, int l2, double label, std::vector<double>& neu1e)
{
    // Calculate the dot-product between the input word's weights (in
    // syn0) and the output word's weights (in syn1neg).
    auto f = 0.0;
    for (int c = 0; c < m__numberOfFeatures; c++)
        f += syn0[l1][c] * syn1neg[l2][c];
    // This block does two things:
    // 1. Calculates the output of the network for this training
    //    pair, using the expTable to evaluate the output layer
    //    activation function.
    // 2. Calculates the error at the output, stored in 'g', by
    //    subtracting the network output from the desired output,
    //    and finally multiplies this by the learning rate.
    auto z = 1.0 / (1.0 + exp(-f));
    auto g = m_learningRate * (label - z);
    // Multiply the error by the output layer weights.
    // (I think this is the gradient calculation?)
    // Accumulate these gradients over all of the negative samples.
    for (int c = 0; c < m__numberOfFeatures; c++)
        neu1e[c] += (g * syn1neg[l2][c]);
    // Update the output layer weights by multiplying the output error
    // by the hidden layer weights.
    for (int c = 0; c < m__numberOfFeatures; c++)
        syn1neg[l2][c] += g * syn0[l1][c];
}
This method gets called by
void train(const std::string& s0, const std::string& s1, bool isPositive, std::vector<double>& neu1e)
{
    auto l1 = m_wordIDs.find(s0) != m_wordIDs.end() ? m_wordIDs[s0] : -1;
    auto l2 = m_wordIDs.find(s1) != m_wordIDs.end() ? m_wordIDs[s1] : -1;
    if (l1 == -1 || l2 == -1)
        return;
    train(l1, l2, isPositive ? 1 : 0, neu1e);
}
which in turn gets called by the main training method.
Full code can be found at
https://github.com/jorisschellekens/ml/tree/master/word2vec
With complete example at
https://github.com/jorisschellekens/ml/blob/master/main/example_8.hpp
When I run this algorithm, the top 10 words 'closest' to father are:
father
Khan
Shah
forgetful
Miami
rash
symptoms
Funeral
Indianapolis
impressed
This is the method to calculate the nearest words:
std::vector<std::string> nearest(const std::string& s, int k) const
{
    // calculate distance
    std::vector<std::tuple<std::string, double>> tmp;
    for (auto& t : m_unigramFrequency)
    {
        tmp.push_back(std::make_tuple(t.first, distance(t.first, s)));
    }
    // sort
    std::sort(tmp.begin(), tmp.end(), [](const std::tuple<std::string, double>& t0, const std::tuple<std::string, double>& t1)
    {
        return std::get<1>(t0) < std::get<1>(t1);
    });
    // take top k
    std::vector<std::string> out;
    for (int i = 0; i < k; i++)
    {
        out.push_back(std::get<0>(tmp[tmp.size() - 1 - i]));
    }
    // return
    return out;
}
Which seems weird.
Is something wrong with my algorithm?
Are you sure that you get the "nearest" words (and not the farthest)? The comparator sorts tmp in ascending order of distance, but the loop below takes elements from the end of the vector:
...
// take top k
std::vector<std::string> out;
for (int i = 0; i < k; i++)
{
    out.push_back(std::get<0>(tmp[tmp.size() - 1 - i]));
}
...
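If "nearest" is what's intended, the loop should presumably read from the front of tmp, where the smallest distances sit after the ascending sort. A sketch of the corrected loop:
// take top k (nearest): after the ascending sort, the smallest
// distances are at the front of tmp
std::vector<std::string> out;
for (int i = 0; i < k; i++)
{
    out.push_back(std::get<0>(tmp[i]));
}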

Limit values of struct member [duplicate]

This question already has answers here:
Fastest way to clamp a real (fixed/floating point) value?
(14 answers)
Closed 7 years ago.
I want to create a simple struct that stores the RGB values of a color. r, g and b are supposed to be double values in [0,1].
struct Color
{
    double r, g, b; // each in [0, 1]
    Color(double x): r{x}, g{x}, b{x} {
        if (r < 0.0) r = 0.0;
        if (r > 1.0) r = 1.0;
        if (g < 0.0) g = 0.0;
        if (g > 1.0) g = 1.0;
        if (b < 0.0) b = 0.0;
        if (b > 1.0) b = 1.0;
    }
};
Is there a better way than using those if statements?
Just write a function to clamp:
double clamp(double val, double left = 0.0, double right = 1.0) {
    return std::min(std::max(val, left), right);
}
And use that in your constructor:
Color(double x)
    : r{clamp(x)}
    , g{clamp(x)}
    , b{clamp(x)}
{ }
You can use min and max, ideally combining them into a clamp function:
template <class T>
T clamp(T val, T min, T max)
{
    return std::min(max, std::max(min, val));
}
struct Color
{
    double r, g, b;
    Color(double x) : r{clamp(x, 0., 1.)}, g{clamp(x, 0., 1.)}, b{clamp(x, 0., 1.)}
    {}
};
For a first pass iteration, we have min/max functions we can and should use:
struct Color
{
    explicit Color(double x): r{x}, g{x}, b{x}
    {
        r = std::max(r, 0.0);
        r = std::min(r, 1.0);
        g = std::max(g, 0.0);
        g = std::min(g, 1.0);
        b = std::max(b, 0.0);
        b = std::min(b, 1.0);
    }
    double r, g, b;
};
I'd also suggest making that constructor explicit, as it's rather confusing for a scalar to implicitly convert to a Color.
The reason this is arguably an upgrade, even though it is roughly the same amount of code and arguably not a huge improvement in readability, is that min and max can often guarantee an efficient (e.g. branchless) implementation, whereas an optimizing compiler merely might emit such code for the hand-written branches. You're also expressing what you're doing in a slightly more direct way.
There is some truth to this somewhat counter-intuitive idea that writing higher level code helps you achieve efficiency, if only for the reason that the low-level logic used to implement the high-level function is more likely to be efficient than what people would repeatedly write otherwise in their more casual, daily kind of code. It also helps direct your codebase towards more central targets for optimization.
As a second pass, this may not improve things for your particular use case, but in general I've found it useful to represent color and vector components using an array so that you can access them with loops. If you start doing somewhat complex things with colors, like blending them together, the logic for each color component is non-trivial but identical across components, and you don't want to write such code three times or always be forced to factor the per-component logic into a separate function.
So we might do this:
class Color
{
public:
    explicit Color(double x)
    {
        for (int j = 0; j < 3; ++j)
        {
            rgb[j] = x;
            rgb[j] = std::max(rgb[j], 0.0);
            rgb[j] = std::min(rgb[j], 1.0);
        }
    }
    // Bounds-checking assertions in these would also be a nice idea.
    double& operator[](int n) { return rgb[n]; }
    double operator[](int n) const { return rgb[n]; }
    double& red() { return rgb[0]; }
    double red() const { return rgb[0]; }
    double& green() { return rgb[1]; }
    double green() const { return rgb[1]; }
    double& blue() { return rgb[2]; }
    double blue() const { return rgb[2]; }
    // Somewhat excess fluff, but such methods can be useful when
    // interacting with a low-level C-style API (OpenGL, e.g.) as
    // opposed to using &color.red() or &color[0].
    double* data() { return rgb; }
    const double* data() const { return rgb; }
private:
    double rgb[3];
};
Finally, as others have mentioned, this is where a function to clamp values to a range is useful, so as a final pass:
template <class T>
T clamp(T val, T low, T high)
{
    assert(low <= high);
    return std::max(std::min(val, high), low);
}
// New constructor using clamp:
explicit Color(double x)
{
    for (int j = 0; j < 3; ++j)
        rgb[j] = clamp(x, 0.0, 1.0);
}

How to generate Zipf distributed numbers efficiently?

I'm currently benchmarking some data structures in C++ and I want to test them on Zipf-distributed numbers.
I'm using the generator provided on this site: http://www.cse.usf.edu/~christen/tools/toolpage.html
I adapted the implementation to use a Mersenne Twister generator.
It works well, but it is really slow. In my case, the range can be big (about a million) and the number of random numbers generated can be several million.
The alpha parameter does not change over time; it is fixed.
I tried to precalculate all the sum_prob values. That's much faster, but it is still slow for big ranges.
Is there a faster way to generate Zipf-distributed numbers? Even something less precise would be welcome.
Thanks
The pre-calculation alone does not help so much. But sum_prob is cumulative and in ascending order, so by using a binary search to find the zipf_value we can decrease the cost of generating a Zipf-distributed number from O(n) to O(log n), which is a big improvement in efficiency. Here it is; just replace the zipf() function in genzipf.c with the following one:
int zipf(double alpha, int n)
{
    static int first = TRUE;   // Static first-time flag
    static double c = 0;       // Normalization constant
    static double *sum_probs;  // Pre-calculated sum of probabilities
    double z;                  // Uniform random number (0 < z < 1)
    int zipf_value;            // Computed exponential value to be returned
    int i;                     // Loop counter
    int low, high, mid;        // Binary-search bounds

    // Compute normalization constant on first call only
    if (first == TRUE)
    {
        for (i = 1; i <= n; i++)
            c = c + (1.0 / pow((double) i, alpha));
        c = 1.0 / c;

        sum_probs = malloc((n+1)*sizeof(*sum_probs));
        sum_probs[0] = 0;
        for (i = 1; i <= n; i++) {
            sum_probs[i] = sum_probs[i-1] + c / pow((double) i, alpha);
        }
        first = FALSE;
    }

    // Pull a uniform random number (0 < z < 1)
    do
    {
        z = rand_val(0);
    }
    while ((z == 0) || (z == 1));

    // Map z to the value using binary search in the cumulative table
    low = 1; high = n;
    do {
        mid = floor((low+high)/2);
        if (sum_probs[mid] >= z && sum_probs[mid-1] < z) {
            zipf_value = mid;
            break;
        } else if (sum_probs[mid] >= z) {
            high = mid-1;
        } else {
            low = mid+1;
        }
    } while (low <= high);

    // Assert that zipf_value is between 1 and N
    assert((zipf_value >= 1) && (zipf_value <= n));

    return zipf_value;
}
The only C++11 Zipf random generator I could find calculated the probabilities explicitly and used std::discrete_distribution. This works fine for small ranges, but is not useful if you need to generate Zipf values over a very wide range (for database testing, in my case), since it will exhaust memory. So, I implemented the algorithm below in C++.
I have not rigorously tested this code, and some optimizations are probably possible, but it only requires constant space and seems to work well.
#include <algorithm>
#include <cmath>
#include <random>
/** Zipf-like random distribution.
*
* "Rejection-inversion to generate variates from monotone discrete
* distributions", Wolfgang Hörmann and Gerhard Derflinger
* ACM TOMACS 6.3 (1996): 169-184
*/
template<class IntType = unsigned long, class RealType = double>
class zipf_distribution
{
public:
typedef RealType input_type;
typedef IntType result_type;
static_assert(std::numeric_limits<IntType>::is_integer, "");
static_assert(!std::numeric_limits<RealType>::is_integer, "");
zipf_distribution(const IntType n=std::numeric_limits<IntType>::max(),
const RealType q=1.0)
: n(n)
, q(q)
, H_x1(H(1.5) - 1.0)
, H_n(H(n + 0.5))
, dist(H_x1, H_n)
{}
IntType operator()(std::mt19937& rng)
{
while (true) {
const RealType u = dist(rng);
const RealType x = H_inv(u);
const IntType k = clamp<IntType>(std::round(x), 1, n);
if (u >= H(k + 0.5) - h(k)) {
return k;
}
}
}
private:
/** Clamp x to [min, max]. */
template<typename T>
static constexpr T clamp(const T x, const T min, const T max)
{
return std::max(min, std::min(max, x));
}
/** (exp(x) - 1) / x */
static double
expxm1bx(const double x)
{
return (std::abs(x) > epsilon)
? std::expm1(x) / x
: (1.0 + x/2.0 * (1.0 + x/3.0 * (1.0 + x/4.0)));
}
/** H(x) = log(x) if q == 1, (x^(1-q) - 1)/(1 - q) otherwise.
* H(x) is an integral of h(x).
*
* Note the numerator is one less than in the paper, in order to work
* with all positive q.
*/
const RealType H(const RealType x)
{
const RealType log_x = std::log(x);
return expxm1bx((1.0 - q) * log_x) * log_x;
}
/** log(1 + x) / x */
static RealType
log1pxbx(const RealType x)
{
return (std::abs(x) > epsilon)
? std::log1p(x) / x
: 1.0 - x * ((1/2.0) - x * ((1/3.0) - x * (1/4.0)));
}
/** The inverse function of H(x) */
const RealType H_inv(const RealType x)
{
const RealType t = std::max(-1.0, x * (1.0 - q));
return std::exp(log1pxbx(t) * x);
}
/** The hat function h(x) = 1 / (x ^ q) */
const RealType h(const RealType x)
{
return std::exp(-q * std::log(x));
}
static constexpr RealType epsilon = 1e-8;
IntType n; ///< Number of elements
RealType q; ///< Exponent
RealType H_x1; ///< H(x_1)
RealType H_n; ///< H(n)
std::uniform_real_distribution<RealType> dist; ///< [H(x_1), H(n)]
};
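A minimal usage sketch for the class above (my example, not the original author's; the parameters follow the constructor shown, n items and exponent q):
#include <cstdio>
#include <random>

int main()
{
    std::mt19937 rng(std::random_device{}());
    zipf_distribution<> zipf(1000000, 1.0); // n = 10^6, exponent q = 1.0
    for (int i = 0; i < 10; ++i)
        std::printf("%lu\n", zipf(rng)); // IntType defaults to unsigned long
    return 0;
}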
The following line in your code is executed n times for each call to zipf():
sum_prob = sum_prob + c / pow((double) i, alpha);
It is regrettable that it is necessary to call the pow() function because, internally, this function sums not one but two Taylor series [considering that pow(x, alpha) == exp(alpha*log(x))]. If alpha is an integer, of course, then you can speed the code up a lot by replacing pow() with simple multiplication. If alpha is a rational number, then you may be able to speed the code up to a lesser degree by coding a Newton-Raphson iteration to take the place of the two Taylor series. If the last condition holds, please advise.
Fortunately, you have indicated that alpha does not change. Can you not speed the code up a lot by preparing a table of pow((double) i, alpha), then letting zipf() look the numbers up in the table? That way, zipf() would not have to call pow() at all. I suspect that this would save significant time.
Yet further improvements are possible. What if you factored a function sumprob() out of zipf()? Could you not prepare an even more aggressive look-up table for sumprob()'s use?
Maybe some of these ideas will move you in the right direction. See what you can do with them.
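For concreteness, a sketch (mine, not the answerer's) of the look-up-table idea, assuming alpha and n are fixed before any draws:
#include <cmath>
#include <vector>

// Precompute i^alpha once; zipf() can then replace every
// pow((double) i, alpha) call with a pow_table[i] lookup.
std::vector<double> pow_table(n + 1);
for (int i = 1; i <= n; ++i)
    pow_table[i] = std::pow(static_cast<double>(i), alpha);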
Update: I see that your question as now revised may not be able to use this answer. From the present point, your question may resolve into a question in complex variable theory. Such are often not easy questions, as you know. It may be that a sufficiently clever mathematician has discovered a relevant recurrence relation or some trick like the normal distribution's Box-Muller technique but, if so, I am not acquainted with the technique. Good luck. (It probably does not matter to you but, in case it does, the late N. N. Lebedev's excellent 1972 book Special Functions and Their Applications is available in English translation from the Russian in an inexpensive paperback edition. If you really, really wanted to crack this problem, you might read Lebedev next -- but, of course, that is a desperate measure, isn't it?)
As a complement to the very nice rejection-inversion implementation given above, here's a C++ class with the same API that is simpler and faster, but only for a small number of bins. On my machine, it's about 2.3x faster for N=300. It's faster because it performs a direct table lookup instead of computing logs and powers. The table eats cache, though... Making a guess based on the size of my CPU's d-cache, I imagine that the proper rejection-inversion algorithm given above becomes faster somewhere around N=35K. Also, initializing the table requires a call to std::pow() for each bin, so this wins on performance only if you draw more than N values from it. Otherwise, rejection-inversion is faster. Choose wisely.
(I've set up the API so it looks a lot like what the std::c++ standards committee might come up with.)
/**
* Example usage:
*
* std::random_device rd;
* std::mt19937 gen(rd());
* zipf_table_distribution<> zipf(300);
*
* for (int i = 0; i < 100; i++)
* printf("draw %d %d\n", i, zipf(gen));
*/
template<class IntType = unsigned long, class RealType = double>
class zipf_table_distribution
{
public:
typedef IntType result_type;
static_assert(std::numeric_limits<IntType>::is_integer, "");
static_assert(!std::numeric_limits<RealType>::is_integer, "");
/// zipf_table_distribution(N, s)
/// Zipf distribution for `N` items, in the range `[1,N]` inclusive.
/// The distribution follows the power-law 1/n^s with exponent `s`.
/// This uses a table-lookup, and thus provides values more
/// quickly than zipf_distribution. However, the table can take
/// up a considerable amount of RAM, and initializing this table
/// can consume significant time.
zipf_table_distribution(const IntType n,
const RealType q=1.0) :
_n(init(n,q)),
_q(q),
_dist(_pdf.begin(), _pdf.end())
{}
void reset() {}
IntType operator()(std::mt19937& rng)
{
return _dist(rng);
}
/// Returns the parameter the distribution was constructed with.
RealType s() const { return _q; }
/// Returns the minimum value potentially generated by the distribution.
result_type min() const { return 1; }
/// Returns the maximum value potentially generated by the distribution.
result_type max() const { return _n; }
private:
std::vector<RealType> _pdf; ///< Prob. distribution
IntType _n; ///< Number of elements
RealType _q; ///< Exponent
std::discrete_distribution<IntType> _dist; ///< Draw generator
/** Initialize the probability mass function */
IntType init(const IntType n, const RealType q)
{
_pdf.reserve(n+1);
_pdf.emplace_back(0.0);
for (IntType i=1; i<=n; i++)
_pdf.emplace_back(std::pow((double) i, -q));
return n;
}
};
Here's a version that is 2x faster than drobilla's original post; it also supports a non-zero deformation parameter q (aka Hurwicz q, q-series q, or quantum-group deformation q) and changes the notation to conform to standard usage in number-theory textbooks. Rigorously tested; see the unit tests at https://github.com/opencog/cogutil/blob/master/tests/util/zipfUTest.cxxtest
Dual-licensed under the MIT license or the GNU Affero license; please copy into the C++ standard as desired.
/**
* Zipf (Zeta) random distribution.
*
* Implementation taken from drobilla's May 24, 2017 answer to
* https://stackoverflow.com/questions/9983239/how-to-generate-zipf-distributed-numbers-efficiently
*
* That code is referenced with this:
* "Rejection-inversion to generate variates from monotone discrete
* distributions", Wolfgang Hörmann and Gerhard Derflinger
* ACM TOMACS 6.3 (1996): 169-184
*
* Note that the Hörmann & Derflinger paper, and the stackoverflow
* code base, incorrectly name the parameter as `q`, when they mean `s`.
* Their `q` has nothing to do with the q-series. The names in the code
* below conform to conventions.
*
* Example usage:
*
* std::random_device rd;
* std::mt19937 gen(rd());
* zipf_distribution<> zipf(300);
*
* for (int i = 0; i < 100; i++)
* printf("draw %d %d\n", i, zipf(gen));
*/
template<class IntType = unsigned long, class RealType = double>
class zipf_distribution
{
public:
typedef IntType result_type;
static_assert(std::numeric_limits<IntType>::is_integer, "");
static_assert(!std::numeric_limits<RealType>::is_integer, "");
/// zipf_distribution(N, s, q)
/// Zipf distribution for `N` items, in the range `[1,N]` inclusive.
/// The distribution follows the power-law 1/(n+q)^s with exponent
/// `s` and Hurwicz q-deformation `q`.
zipf_distribution(const IntType n=std::numeric_limits<IntType>::max(),
const RealType s=1.0,
const RealType q=0.0)
: n(n)
, _s(s)
, _q(q)
, oms(1.0-s)
, spole(std::abs(oms) < epsilon)
, rvs(spole ? 0.0 : 1.0/oms)
, H_x1(H(1.5) - h(1.0))
, H_n(H(n + 0.5))
, cut(1.0 - H_inv(H(1.5) - h(1.0)))
, dist(H_x1, H_n)
{
if (-0.5 >= q)
throw std::runtime_error("Range error: Parameter q must be greater than -0.5!");
}
void reset() {}
IntType operator()(std::mt19937& rng)
{
while (true)
{
const RealType u = dist(rng);
const RealType x = H_inv(u);
const IntType k = std::round(x);
if (k - x <= cut) return k;
if (u >= H(k + 0.5) - h(k))
return k;
}
}
/// Returns the parameter the distribution was constructed with.
RealType s() const { return _s; }
/// Returns the Hurwicz q-deformation parameter.
RealType q() const { return _q; }
/// Returns the minimum value potentially generated by the distribution.
result_type min() const { return 1; }
/// Returns the maximum value potentially generated by the distribution.
result_type max() const { return n; }
private:
IntType n; ///< Number of elements
RealType _s; ///< Exponent
RealType _q; ///< Deformation
RealType oms; ///< 1-s
bool spole; ///< true if s near 1.0
RealType rvs; ///< 1/(1-s)
RealType H_x1; ///< H(x_1)
RealType H_n; ///< H(n)
RealType cut; ///< rejection cut
std::uniform_real_distribution<RealType> dist; ///< [H(x_1), H(n)]
// This provides 16 decimal places of precision,
// i.e. good to (epsilon)^4 / 24 per the expansions of log and exp below.
static constexpr RealType epsilon = 2e-5;
/** (exp(x) - 1) / x */
static double
expxm1bx(const double x)
{
if (std::abs(x) > epsilon)
return std::expm1(x) / x;
return (1.0 + x/2.0 * (1.0 + x/3.0 * (1.0 + x/4.0)));
}
/** log(1 + x) / x */
static RealType
log1pxbx(const RealType x)
{
if (std::abs(x) > epsilon)
return std::log1p(x) / x;
return 1.0 - x * ((1/2.0) - x * ((1/3.0) - x * (1/4.0)));
}
/**
* The hat function h(x) = 1/(x+q)^s
*/
const RealType h(const RealType x)
{
return std::pow(x + _q, -_s);
}
/**
* H(x) is an integral of h(x).
* H(x) = [(x+q)^(1-s) - (1+q)^(1-s)] / (1-s)
* and if s==1 then
* H(x) = log(x+q) - log(1+q)
*
* Note that the numerator is one less than in the paper, in
* order to work with all s. Unfortunately, the naive
* implementation of the above hits numerical underflow
* when q is larger than 10 or so, so we split into
* different regimes.
*
* When q != 0, we shift back to what the paper defined:
* H(x) = (x+q)^{1-s} / (1-s)
* and for q != 0 and also s==1, use
* H(x) = [exp{(1-s) log(x+q)} - 1] / (1-s)
*/
const RealType H(const RealType x)
{
if (not spole)
return std::pow(x + _q, oms) / oms;
const RealType log_xpq = std::log(x + _q);
return log_xpq * expxm1bx(oms * log_xpq);
}
/**
* The inverse function of H(x).
* H^{-1}(y) = [(1-s)y + (1+q)^{1-s}]^{1/(1-s)} - q
* Same convergence issues as above; two regimes.
*
* For s far away from 1.0 use the paper version
* H^{-1}(y) = -q + (y(1-s))^{1/(1-s)}
*/
const RealType H_inv(const RealType y)
{
if (not spole)
return std::pow(y * oms, rvs) - _q;
return std::exp(y * log1pxbx(oms * y)) - _q;
}
};
In the meantime, there is a faster way based on rejection-inversion sampling; see the code here.