I'm having some trouble getting my head around why LPSOLVE can't find a solution to this.
min:;
x = 1000 ;
5000 - 1 x + 500 y = 0;
From inspection, we can see that x = 1000, and y = -8.
LPSOLVE states that the model is infeasible.
However, when inverting the sign of y, ie:
min:;
x = 1000 ;
5000 - 1 x - 500 y = 0;
LPSOLVE correctly calculates x = 1000, y = 8.
Or, as one would expect, if substituting 1000 in for x,
min:;
x = 1000 ;
5000 - 1000 + 500 y = 0;
also solves correctly, y = -8.
Can anyone shed any light on why the original snippet cannot solve?
Thanks
I found a solution. lp_solve gives every variable a default lower bound of 0, so y = -8 lies outside the feasible region unless that bound is lifted explicitly. Adding these constraints gives the correct result:
x >= -Inf;
y >= -Inf;
I have a value x, which is a combination of decision variables.
I need to calculate a cost, which only triggers if x > 100. So cost = MAX(x - 100, 0) * 20.
Is there any way to do this in linear programming?
I've tried creating two binary variables (y1 and y2), where y1 = 1 when x <= 100, y2 = 1 when x > 100, and y1 + y2 = 1, following this page: https://uk.mathworks.com/matlabcentral/answers/693740-linear-programming-with-conditional-constraints. However, Excel Solver still complains about non-linearity.
Any advice on how I can fix this?
The objective
min cost = max(x-100,0)*20
can be implemented in an LP as:
min cost = y*20
y >= x - 100
x >= 0, y >= 0
There is no need for binary variables.
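As a rough illustration of why the two linear constraints are enough, here is a brute-force check in Python. It is not an LP solver: the helper names `min_feasible_y` and `cost`, the grid step, and the search limit are all assumptions made for this sketch. For a fixed x it walks upward from 0 and returns the first y that satisfies both constraints, which is what a minimizing solver would settle on.

```python
def min_feasible_y(x, threshold=100.0, step=0.5, y_limit=1000.0):
    """Smallest grid value y with y >= x - threshold and y >= 0 (hypothetical helper)."""
    y = 0.0
    while y <= y_limit:
        # Both constraints from the LP formulation must hold.
        if y >= x - threshold and y >= 0.0:
            return y
        y += step
    raise ValueError("no feasible y found below y_limit")


def cost(x, rate=20.0):
    """Cost at the minimizing y, i.e. rate * max(x - 100, 0)."""
    return rate * min_feasible_y(x)


if __name__ == "__main__":
    for x in (80.0, 100.0, 130.0):
        print(x, cost(x))
```

The trick relies on the cost being minimized: the solver pushes y down until it hits the larger of its two lower bounds. If the same term were being maximized, y would be unbounded above and binary variables (or another reformulation) would be needed.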
I want to write a simple LP using docplex. Suppose I have three variables: x, y, and z, the constraint is 4x + 9y - 18.7 <= z. I wrote the constraint with the code model.add_constraint(4 * x + 9 * y - 18.7 <= z). Then I set minimize z as my objective by model.minimize(z).
After solving the model, I get the result of z = 0.000. Can anyone explain the result to me? I don't understand why 0 is the optimal value of this LP. I also tried to print the details of this model:
status = optimal
time = 0 s.
problem = LP
z: 0.000; None
objective: z
constraint: 4z+9y-18.700 <= z
When I call model.print_solution(), the program prints z: 0.000; None, and I don't understand what "None" means. Does it mean x and y are None?
Update: Forgot to mention, I created the variable using model.continuous_var()
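Indeed, if you do not give a range, continuous variables get a lower bound of 0, so they are non-negative. In your model that means x = y = 0 makes the left-hand side -18.7, so z = 0 satisfies the constraint, and z cannot go below its own lower bound of 0.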
Small example out of the zoo story:
from docplex.mp.model import Model
mdl = Model(name='buses')
nbbus40 = mdl.continuous_var(name='nbBus40')
nbbus30 = mdl.continuous_var(name='nbBus30')
mdl.add_constraint(nbbus40*40 + nbbus30*30 >= 300, 'kids')
mdl.minimize(nbbus40*500 + nbbus30*400)
mdl.solve(log_output=False,)
print("nbbus40.lb =",nbbus40.lb)
for v in mdl.iter_continuous_vars():
    print(v," = ",v.solution_value)
mdlv2 = Model(name='buses2')
nbbus40v2 = mdlv2.continuous_var(-2,200,name='nbBus40')
nbbus30v2 = mdlv2.continuous_var(-2,200,name='nbBus30')
mdlv2.add_constraint(nbbus40v2*40 + nbbus30v2*30 >= 300, 'kids')
mdlv2.minimize(nbbus40v2*500 + nbbus30v2*400)
mdlv2.solve(log_output=False,)
print("nbbus40v2.lb =",nbbus40v2.lb)
for v in mdlv2.iter_continuous_vars():
    print(v," = ",v.solution_value)
gives
nbbus40.lb = 0
nbBus40 = 7.5
nbBus30 = 0
nbbus40v2.lb = -2
nbBus40 = 9.0
nbBus30 = -2.0
Consider an interval of values [x, y] equally subdivided in n samples in the following way:
y can be greater, equal or less than x.
Now, we pick up a value z between x and y.
Question: what is the formula to compute the index i of z ? (if x = y, then the formula should return 0 or n-1) (I repeat: y can be greater, equal or less than x.)
For example: if x = - 5, y = -10 and n = 5, then for z = -7.5, i = 2 (if z = -7, i = 2 but if z = -8, i = 3).
You can compute the length of the interval as:
len = y - x
Then you can compute the increase per a single element
increase = len / n;
And now you have i = (z - x) / increase. In short, you compute how much the value increases per element, and then you compute how many of these increases are needed to get from x to z.
EDIT: if you really require the solution in C++ take care to do all the calculations in double. Also please note the value of i should be an integer rounded down.
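The steps above can be sketched in Python as follows. The function name `sample_index` is made up for this example, and two details are assumptions not spelled out in the answer: the result is clamped to [0, n-1] so that z = y maps to the last sample, and x = y returns 0 to avoid dividing by zero.

```python
import math


def sample_index(x, y, n, z):
    """Index of the sample containing z when [x, y] is split into n equal parts."""
    if n <= 0:
        raise ValueError("n must be positive")
    if x == y:
        # Degenerate interval: the question allows 0 or n - 1 here; 0 is chosen.
        return 0
    increase = (y - x) / n                  # signed step, negative when y < x
    i = math.floor((z - x) / increase)      # rounded down, as the answer requires
    return min(max(i, 0), n - 1)            # assumption: clamp the endpoint z == y


if __name__ == "__main__":
    print(sample_index(-5, -10, 5, -7.5))   # 2
    print(sample_index(-5, -10, 5, -8))     # 3
```

Because the step keeps its sign, the same expression works whether y is greater or less than x.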
Answer logic (in Java):
i = Math.abs(Math.ceil(z - Math.min(x,y)));
if (x > y) { high = x; low = y; }
else { high = y; low = x; }
if (y >= x)
    i = Math.ceil((z - low + 1) / (high - low + 1) * n) - 1;
else
    i = Math.ceil((high - z + 1) / (high - low + 1) * n) - 1;
I am using a Softmax activation function in the last layer of a neural network. But I have problems with a safe implementation of this function.
A naive implementation would be this one:
Vector y = mlp(x); // output of the neural network without softmax activation function
for(int f = 0; f < y.rows(); f++)
    y(f) = exp(y(f));
y /= y.sum();
This does not work very well for > 100 hidden nodes because the y will be NaN in many cases (if y(f) > 709, exp(y(f)) will return inf). I came up with this version:
Vector y = mlp(x); // output of the neural network without softmax activation function
for(int f = 0; f < y.rows(); f++)
    y(f) = safeExp(y(f), y.rows());
y /= y.sum();
where safeExp is defined as
double safeExp(double x, int div)
{
    static const double maxX = std::log(std::numeric_limits<double>::max());
    const double max = maxX / (double) div;
    if(x > max)
        x = max;
    return std::exp(x);
}
This function limits the input of exp. In most of the cases this works but not in all cases and I did not really manage to find out in which cases it does not work. When I have 800 hidden neurons in the previous layer it does not work at all.
However, even if this worked I somehow "distort" the result of the ANN. Can you think of any other way to calculate the correct solution? Are there any C++ libraries or tricks that I can use to calculate the exact output of this ANN?
edit: The solution provided by Itamar Katz is:
Vector y = mlp(x); // output of the neural network without softmax activation function
double ymax = y.maxCoeff(); // maximal component of y (maxCoeff() assumes an Eigen-style Vector)
for(int f = 0; f < y.rows(); f++)
    y(f) = exp(y(f) - ymax);
y /= y.sum();
And it really is mathematically the same. In practice however, some small values become 0 because of the floating point precision. I wonder why nobody ever writes these implementation details down in textbooks.
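For comparison, here is a minimal Python sketch of the same max-subtraction idea, using plain lists instead of the `Vector` type (the function name `softmax` and the list-based interface are assumptions for illustration). After the shift the largest exponent is 0, so `math.exp` can no longer overflow; very negative entries underflow to 0.0, which is the effect described above.

```python
import math


def softmax(values):
    """Softmax with the maximum subtracted first to avoid overflow in exp."""
    m = max(values)
    # Every shifted value is <= 0, so exp() returns something in (0, 1]
    # or underflows quietly to 0.0.
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)   # at least 1.0, because the max entry contributes exp(0)
    return [e / total for e in exps]


if __name__ == "__main__":
    print(softmax([1000.0, 1000.0]))   # large inputs no longer produce inf or NaN
    print(softmax([0.0, 1000.0]))      # the small entry underflows to 0.0
```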
First go to log scale, i.e., calculate log(y) instead of y. The log of the numerator is trivial. To calculate the log of the denominator, you can use the following 'trick': http://lingpipe-blog.com/2009/06/25/log-sum-of-exponentials/
I know it's already answered but I'll post here a step-by-step anyway.
put on log:
zj = wj . x + bj
oj = exp(zj)/sum_i{ exp(zi) }
log oj = zj - log sum_i{ exp(zi) }
Let m = max_i { zi } and use the log-sum-exp trick:
log oj = zj - log {sum_i { exp(zi + m - m)}}
= zj - log {sum_i { exp(m) exp(zi - m) }},
= zj - log {exp(m) sum_i {exp(zi - m)}}
= zj - m - log {sum_i { exp(zi - m)}}
The term exp(zi - m) can underflow if m is much greater than the other zi, but that's OK, since it means zi is irrelevant to the softmax output after normalization. The final result is:
oj = exp (zj - m - log{sum_i{exp(zi-m)}})
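The derivation above translates almost line by line into code. This is a small Python sketch under the assumption that the pre-activations zi are given as a plain list; the name `log_softmax` is chosen for the example. Staying in log scale keeps information that the plain softmax loses when a small output underflows to 0.

```python
import math


def log_softmax(z):
    """log oj = zj - m - log(sum_i exp(zi - m)), with m = max_i zi."""
    m = max(z)
    # The sum is at least 1.0 because the maximal entry contributes exp(0),
    # so the logarithm is always defined.
    log_sum = m + math.log(sum(math.exp(v - m) for v in z))
    return [v - log_sum for v in z]


if __name__ == "__main__":
    logs = log_softmax([0.0, 1000.0])
    print(logs)                          # the tiny output is kept as a finite log value
    print([math.exp(v) for v in logs])   # exponentiate only if probabilities are needed
```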
So i'm implementing a heuristic algorithm, and i've come across this function.
I have an array of 1 to n (0 to n-1 on C, w/e). I want to choose a number of elements i'll copy to another array. Given a parameter y, (0 < y <= 1), i want to have a distribution of numbers whose average is (y * n). That means that whenever i call this function, it gives me a number, between 0 and n, and the average of these numbers is y*n.
According to the author, "l" is a random number: 0 < l < n . On my test code its currently generating 0 <= l <= n. And i had the right code, but i'm messing with this for hours now, and i'm lazy to code it back.
So i coded the first part of the function, for y <= 0.5
I set y to 0.2, and n to 100. That means it had to return a number between 0 and 99, with average 20.
And the results aren't between 0 and n, but some floats. And the bigger n is, smaller this float is.
This is the C test code. "x" is the "l" parameter.
//hate how code tag works, it's not even working now
int n = 100;
float y = 0.2;
float n_copy;
for(int i = 0 ; i < 20 ; i++)
{
    float x = (float) (rand()/(float)RAND_MAX); // 0 <= x <= 1
    x = x * n; // 0 <= x <= n
    float p1 = (1 - y) / (n*y);
    float p2 = (1 - ( x / n ));
    float exp = (1 - (2*y)) / y;
    p2 = pow(p2, exp);
    n_copy = p1 * p2;
    printf("%.5f\n", n_copy);
}
And here are some results (5 decimals truncated):
0.03354
0.00484
0.00003
0.00029
0.00020
0.00028
0.00263
0.01619
0.00032
0.00000
0.03598
0.03975
0.00704
0.00176
0.00001
0.01333
0.03396
0.02795
0.00005
0.00860
The article is:
http://www.scribd.com/doc/3097936/cAS-The-Cunning-Ant-System
pages 6 and 7.
or search "cAS: cunning ant system" on google.
So what am i doing wrong? i don't believe the author is wrong, because there are more than 5 papers describing this same function.
all my internets to whoever helps me. This is important to my work.
Thanks :)
You may misunderstand what is expected of you.
Given a (properly normalized) PDF, and wanting to throw random numbers consistent with it, you form the cumulative distribution function (CDF) by integrating the PDF, then invert the CDF, and use a uniform random variate as the argument of the inverted function.
A little more detail.
f_s(l) is the PDF, and has been normalized on [0,n).
Now you integrate it to form the CDF
g_s(l') = \int_0^{l'} dl f_s(l)
Note that this is a definite integral to an unspecified endpoint which I have called l'. The CDF is accordingly a function of l'. Assuming we have the normalization right, g_s(n) = 1.0. If this is not so we apply a simple coefficient to fix it.
Next invert the CDF and call the result G^{-1}(x). For this you'll probably want to choose a particular value of gamma.
Then throw uniform random numbers on [0,1), and use those as the argument, x, to G^{-1}. The results should lie in [0,n), and should be distributed according to f_s.
Like Justin said, you can use a computer algebra system for the math.
dmckee is actually correct, but I thought that I would elaborate more and try to explain away some of the confusion here. I could definitely fail. f_s(l), the function you have in your pretty formula above, is the probability distribution function. It tells you, for a given input l between 0 and n, the probability that l is the segment length. The sum (integral) for all values between 0 and n should be equal to 1.
The graph at the top of page 7 confuses this point. It plots l vs. f_s(l), but you have to watch out for the stray factors it puts on the side. You notice that the values on the bottom go from 0 to 1, but there is a factor of x n on the side, which means that the l values actually go from 0 to n. Also, on the y-axis there is a x 1/n which means these values don't actually go up to about 3, they go to 3/n.
So what do you do now? Well, you need to solve for the cumulative distribution function by integrating the probability distribution function over l, which actually turns out to be not too bad (I did it with the Wolfram Mathematica Online Integrator by using x for l and using only the equation for y <= .5). That, however, was an indefinite integral, and you are really integrating along x from 0 to l. If we set the resulting equation equal to some variable (z, for instance), the goal now is to solve for l as a function of z. Here z is a random number between 0 and 1. You can try using a symbolic solver for this part if you would like (I would). Then you have not only achieved your goal of being able to pick random values of l from this distribution, you have also achieved nirvana.
A little more work done
I'll help a little bit more. I tried doing what I described above for y <= .5, but the symbolic algebra system I was using wasn't able to do the inversion (some other system might be able to). However, I then decided to try using the equation for .5 < y <= 1. This turns out to be much easier. If I change l to x in f_s(l) I get
y / n / (1 - y) * (x / n)^((2 * y - 1) / (1 - y))
Integrating this over x from 0 to l I got (using Mathematica's Online Integrator):
(l / n)^(y / (1 - y))
It doesn't get much nicer than that with this sort of thing. If I set this equal to z and solve for l I get:
l = n * z^(1 / y - 1) for .5 < y <= 1
One quick check is for y = 1. In this case, we get l = n no matter what z is. So far so good. Now, you just generate z (a random number between 0 and 1) and you get an l that is distributed as you desired for .5 < y <= 1. But wait, looking at the graph on page 7 you notice that the probability distribution function is symmetric. That means that we can use the above result to find the value for 0 < y <= .5. We just change l -> n-l and y -> 1-y and get
n - l = n * z^(1 / (1 - y) - 1)
l = n * (1 - z^(1 / (1 - y) - 1)) for 0 < y <= .5
Anyway, that should solve your problem unless I made some error somewhere. Good luck.
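The two closed-form results above can be checked numerically with a short Python sketch. The function name `sample_segment_length` and the optional `rng` parameter are assumptions for this example; the formulas are the ones derived in this answer, with z drawn uniformly from [0, 1).

```python
import random


def sample_segment_length(n, y, rng=random):
    """Draw l in [0, n] whose mean is y * n, by inverting the CDF."""
    if not 0.0 < y <= 1.0:
        raise ValueError("y must satisfy 0 < y <= 1")
    z = rng.random()   # uniform on [0, 1)
    if y > 0.5:
        return n * z ** (1.0 / y - 1.0)
    # Mirrored case for 0 < y <= 0.5 (l -> n - l, y -> 1 - y).
    return n * (1.0 - z ** (1.0 / (1.0 - y) - 1.0))


if __name__ == "__main__":
    rng = random.Random(0)
    n, y = 100, 0.2
    draws = [sample_segment_length(n, y, rng) for _ in range(100_000)]
    print(min(draws), max(draws))      # stays inside [0, n]
    print(sum(draws) / len(draws))     # should land close to y * n = 20
```

Unlike the test code in the question, which prints the density f_s(l) at a random l, this returns l itself, so the values cover the range from 0 to n.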
For any values of l, y and n as described, the terms you call p1 and p2 are both in [0,1) and exp is in [1,..), which makes pow(p2, exp) also lie in [0,1). So I don't see how you'd ever get an output in the range [0,n).