Residual values such as 4e-14 randomly appearing in a CPLEX LP model solution: how can I get rid of them?

While the results of my linear programming model in CPLEX seem to make sense, the q variable sometimes shows tiny values such as 4e-14, seemingly at random (at least it seems random to me). This doesn't affect the decision variables, but it is still very irritating, as I am not sure whether something in my model is incorrect. You can see the results of the q variable with the tiny residuals here: Results q variable. These residuals only started appearing in my model once I introduced binary variables.
q is defined as: dexpr float q [t in Years, i in Options] = (c[i] * (a[t+s[i]][i]-a[t+s[i]-1][i]));
a is a decision variable
This is a constraint q is subject to: q[i][t] == a[i] * p[i] * y[t][i]
Since y is a binary variable, q should be either the value of a[i] * p[i] or 0. This is why the residual values irritate me so much.
Does anybody have any idea why these values appear and how to get rid of them? I have already spent a lot of time on this problem and have no idea how to solve it.
Things I noticed while trying to solve it:
Turning all the input variables into integer variables doesn't change anything.
Turning q into an integer variable solves the problem, but ruins the model, since a[i][t] needs to be a float variable.
Adding a constraint forcing q >= 0 does not eliminate the negative residual values such as -4e-14.
Adding a constraint forcing q = 0 for a specific t eliminates the residual values there, but of course also ruins the model.
Thank you very much for your help!

This is a tolerance issue. MIP solvers such as CPLEX have a bunch of tolerances; the ones in play here are the integer feasibility tolerance (epint) and the feasibility tolerance (eprhs). You can tighten them, but I usually leave them as they are. Sometimes it helps to round results before printing them, or simply to use fewer digits when formatting the output.
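As a hedged illustration only (assuming the C++ Concert API with an IloCplex object named cplex, rather than OPL; the snapping threshold and the name qExpr are made up for the example), tightening the two tolerances and cleaning up tiny residuals before printing could look roughly like this:

// Fragment; assumes ilcplex/ilocplex.h, <cmath>, <iomanip>, <iostream> are included.
// Tighten the integer feasibility (EpInt) and feasibility (EpRHS) tolerances.
cplex.setParam(IloCplex::Param::MIP::Tolerances::Integrality, 1e-9);     // epint
cplex.setParam(IloCplex::Param::Simplex::Tolerances::Feasibility, 1e-9); // eprhs
cplex.solve();

// When reading the solution, snap values below a small threshold to zero
// before printing (threshold chosen here for illustration only).
double qval = cplex.getValue(qExpr);      // qExpr: hypothetical handle for the q expression
if (std::fabs(qval) < 1e-6) qval = 0.0;
std::cout << std::fixed << std::setprecision(6) << qval << std::endl;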

Related

How to optimize a nonlinear function with some constraints in C++

I want to find, in C++, the variable values that give a nonlinear formula its maximum value under a constraint.
The task is to calculate the maximum value of the formula below under the given constraint, in C++.
You can also use a library (e.g. NLopt).
formula: ln(1 + a*x + b*y + c*z)
a, b, c are numbers input by the user
x, y, z are the variables to be derived
the constraint is that x, y, z are positive and x + y + z <= 1
This can actually be transformed into a linear optimization problem.
max ln(1 + a*x + b*y + c*z)  <-->  max (a*x + b*y + c*z)  s.t.  a*x + b*y + c*z > -1
(because ln is monotonically increasing, maximizing its argument maximizes the logarithm).
This means it is a linear optimization problem (with one extra constraint) that you can easily handle with whatever C++ method you like, together with your convex optimization knowledge; see the sketch after the reminders below.
Reminders for writing good code:
Check the validity of the input values.
Since the input values can be negative, you need to handle that case, which can yield different results (for example, if a, b, and c are all negative, the maximum is at x = y = z = 0).
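A minimal, hedged sketch of that idea (a plain closed-form solution rather than a full LP solver, since with a linear objective over the simplex the optimum sits at a vertex; all names are mine):

#include <algorithm>
#include <cmath>
#include <iostream>

int main() {
    double a, b, c;
    std::cout << "Enter a b c: ";
    if (!(std::cin >> a >> b >> c)) return 1;      // validate the input

    double coeff[3] = {a, b, c};
    double sol[3]   = {0.0, 0.0, 0.0};

    // Put all weight on the largest coefficient if it is positive,
    // otherwise stay at the origin (all coefficients <= 0).
    int best = static_cast<int>(std::max_element(coeff, coeff + 3) - coeff);
    if (coeff[best] > 0.0) sol[best] = 1.0;

    double value = std::log(1.0 + a * sol[0] + b * sol[1] + c * sol[2]);
    std::cout << "x=" << sol[0] << " y=" << sol[1] << " z=" << sol[2]
              << "  ln(1+ax+by+cz)=" << value << "\n";
    return 0;
}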
P.S.
This problem seems to be off-topic for SO.
If it is your homework, it is for your own good to write the code yourself. Besides, we do not have enough time to write it for you.
This would have been a comment if I had more reputation.

Different least-squares errors with Armadillo functions

Hello stackoverflow community,
I have trouble understanding a least-squares-error problem with the C++ Armadillo package.
I have a matrix A with many more rows than columns (5000 by 100, for example), so the system is overdetermined.
I want to find x so that A*x = b is solved with the least squared error.
If I use Armadillo's solve function on my data, as in x = solve(A, b), the error (A*x - b)^2 is sometimes way too high.
If, on the other hand, I solve for x with the analytical form x = (A^T * A)^-1 * A^T * b, the results are always right.
The results for x in the two cases can differ by 10 orders of magnitude.
I had thought that Armadillo would use this analytical form in the background if the system is overdetermined.
Now I would like to understand why these two methods give such different results.
I wanted to give a short example program, but I can't reproduce this behavior with a small example.
I thought about posting the matrix here, but at 5000 by 100 it is also very large. I can provide the values for which this happens if needed.
Some short background:
The matrix I get from my program is the numerically solved response of a nonlinear oscillator into which I feed information by wiggling a parameter of the system.
Because the influence of this parameter on the system is small, the values in my different rows are very similar, but never identical; otherwise Armadillo should throw an error.
I still think this is the problem, but the solve function never threw any error.
Another thing that confuses me is that in a short example program with a random matrix, the analytical form is much slower than the solve function.
But in my program, both are nearly equally fast.
I guess this has something to do with the numerical behavior of the pseudoinverse and the special structure of my matrix, but I don't know enough about how Armadillo works internally.
I hope someone can help me with this problem. Thanks a lot in advance.
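For reference, a minimal hedged sketch of the comparison described above (Armadillo assumed available; A and b stand in for the actual data):

#include <armadillo>
#include <cstdio>

void compare(const arma::mat& A, const arma::vec& b) {
    // Armadillo's least-squares solve for the overdetermined system A*x = b
    arma::vec x1 = arma::solve(A, b);

    // Explicit analytical (normal-equations) form x = (A^T * A)^-1 * A^T * b
    arma::vec x2 = arma::inv(A.t() * A) * (A.t() * b);

    // Squared residuals of both solutions
    double e1 = arma::accu(arma::square(A * x1 - b));
    double e2 = arma::accu(arma::square(A * x2 - b));
    std::printf("squared error: solve = %g, normal equations = %g\n", e1, e2);
}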
Thanks for the replies. I think I figured out the problem and wanted to give some feedback for everybody who runs into the same issue.
The Armadillo solve function gives me the x that minimizes (A*x - b)^2.
I looked at the values of x, and they are sometimes on the order of 10^13.
This comes from the fact that the rows of my matrix change only slightly (so they are nearly, but not exactly, linearly dependent).
Because of that I was at the limit of the numerical precision of my doubles, and as a result the error sometimes jumped around.
If I use the rearranged analytical formula (A^T * A) * x = A^T * b with the solve function, this problem doesn't occur anymore, because the fitted values of x are on the order of 10^4. The least-squares error is a little bit higher, but that is okay, as I want to avoid overfitting.
I have now additionally added Tikhonov regularization by solving (A^T * A + lambda * I) * x = A^T * b with Armadillo's solve function.
Now the weight vectors are on the order of 1, and the error barely changes compared to the formula without regularization.
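A hedged sketch of that regularized normal-equations approach (the function name and the lambda value are illustrative, not from the original post):

#include <armadillo>

// Solve (A^T * A + lambda * I) * x = A^T * b, i.e. ridge / Tikhonov regression.
arma::vec ridge_solve(const arma::mat& A, const arma::vec& b, double lambda) {
    arma::mat AtA = A.t() * A;      // 100 x 100 for a 5000 x 100 A
    arma::vec Atb = A.t() * b;
    AtA.diag() += lambda;           // add lambda * identity to the diagonal
    return arma::solve(AtA, Atb);
}

// Usage (lambda chosen for illustration only):
// arma::vec x = ridge_solve(A, b, 1e-3);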

Stata: inequality constraint in xttobit

Is it possible to constrain parameters in Stata's xttobit to be non-negative? I read a paper where the authors said they did just that, and I am trying to work out how.
I know that you can constrain parameters to be strictly positive by exponentially transforming the variables (e.g. gen x1_e = exp(x1)) and then calling nlcom after estimation (e.g. nlcom exp(_b[x1:_y]), where y is the independent variable). That may not be exactly right, but I am pretty sure the general idea is correct. (Here is a similar question from Statalist re: nlsur.)
But what would a non-negative constraint look like? I know that one way to proceed is by transforming the variables, for example squaring them. However, I tried this with the authors' data and still found negative estimates from xttobit. Sorry if this is a trivial question, but it has me a little confused.
(Note: this was first posted on CV by mistake. Mea culpa.)
Update: It seems I misunderstand what transformation means. Suppose we want to estimate the following random effects model:
y_{it} = a + b*x_{it} + v_i + e_{it}
where v_i is the individual random effect for i and e_{it} is the idiosyncratic error.
From the first answer, would, say, an exponential transformation to constrain all coefficients to be positive look like:
y_{it} = exp(a) + exp(b)*x_{it} + v_i + e_{it}
?
I think your understanding of constraining parameters by transforming the associated variable is incorrect. You don't transform the variable; rather, you fit your model after re-expressing it in terms of transformed parameters. For more details, see the FAQ at http://www.stata.com/support/faqs/statistics/regression-with-interval-constraints/, and be prepared to work harder on your problem than you might have expected, since you will need to replace xttobit with mlexp and a transformed parameterization of the tobit log-likelihood function.
With regard to the difference between non-negative and strictly positive constraints, for continuous parameters all such constraints are effectively non-negative, because (for reasonable parameterization) a strictly positive constraint can be made arbitrarily close to zero.
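To make the distinction concrete (notation mine, following the model in the update; this only illustrates the reparameterization idea and is not Stata syntax): instead of transforming x_{it}, you would estimate
y_{it} = a + exp(theta_b)*x_{it} + v_i + e_{it}
where theta_b is unconstrained and b = exp(theta_b) is therefore strictly positive. You maximize the (tobit) likelihood over theta_b (e.g. with mlexp) and recover b and its standard error afterwards, e.g. with nlcom.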

Problems parsing from IloBoolVarArray to bool in CPLEX

I have an IloBoolVarArray in a MIP problem. When the solver has finished, I parse these variables to double, but sometimes I get very small values such as 1.3E-008 instead of 0. My question is: why? Is it only a parsing problem? The solver has used this value internally, so is the result not trustworthy?
Thanks a lot.
CPLEX works with double-precision floating-point data internally. It has a tolerance parameter, EpInt. If a variable x has a value
0 <= x <= EpInt, or
1 - EpInt <= x <= 1,
then CPLEX considers the value to be binary (effectively 0 or 1, respectively). The default value of EpInt is 10^-6, so seeing solution values of 10^-8 is consistent with the default behavior of CPLEX. Unless you really need exact integer values, you should account for this when you pull solutions from CPLEX. One particularly bad thing you could do in C++ is
IloBoolVar x(env);
// ...
cplex.solve();
int bad_value = cplex.getValue(x); // BAD
int ok_value = cplex.getValue(x) + 0.5; // OK
Here, bad_value could be set to 0 even if the CPLEX solution has an effective value of 1. That's because CPLEX could have a value of 0.999999 which would be truncated to 0. The second assignment will reliably store the solution.
In the latest version of CPLEX, you can set EpInt to 0, which makes CPLEX consider only exact 0.0 and 1.0 values as binary. If you really do need exact values of 0 or 1, then you should keep in mind the domains CPLEX is designed to work in. If you are trying to use it to solve cryptology problems, for example, you might not get good results, even with small instances.
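For reference, a hedged sketch of the two remedies above in the C++ Concert API (the parameter name below is the modern equivalent of EpInt; setting it to 0.0 requires a recent CPLEX version, as noted):

// Tighten the integrality tolerance so only exact 0/1 values are accepted.
cplex.setParam(IloCplex::Param::MIP::Tolerances::Integrality, 0.0);
cplex.solve();

// Or simply round when extracting, as in the snippet above:
int value = static_cast<int>(cplex.getValue(x) + 0.5);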

GSL interpolation error: x values must be monotonically increasing

Hi, my problem is that my data set is monotonically increasing, but towards the end of the data some of the x[i-1] values equal x[i], as shown below. This causes an error to be raised in GSL because it thinks the values are not monotonically increasing. Is there a solution, fix, or workaround for this problem?
The values are already double precision; this particular data set starts at 9.86553e-06 and ends at 0.999999.
Would the only solution be to offset every value in a for loop?
0.999981
0.999981
0.999981
0.999982
0.999982
0.999983
0.999983
0.999983
0.999984
0.999984
0.999985
0.999985
0.999985
I had a similar issue. I removed the duplicates with a simple conditional (if statement), and this did not affect the final result (checked with MATLAB). This might be a bit problem-specific, though.
If you've genuinely reached the limits of what double precision allows (your delta is smaller than machine epsilon), then there is nothing you can do with the data as they are: the x data aren't monotonically increasing. Rather, you'll have to go back to where they are generated and apply some kind of transform to make the differences bigger at the tails. Or you could multiply by a scalar factor and interpolate between the x values on the fly, then divide the factor back out when you are done.
Edit: tr(x) = (x - 0.5)^3 might do reasonably well to space things out, or tr(x) = tan((x - 0.5)*pi). You have to watch out for extreme values in the latter case, though. And of course, these transformations might mess up the analysis you're trying to do, so it has to be a transformation under which your analysis is invariant; a scalar factor might therefore be the answer. Adding a constant is also likely possible.
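If dropping the duplicates (as in the first reply) is acceptable for your problem, a hedged sketch of that filter might look like this (the cleaned arrays can then be passed to GSL's interpolation routines, which reject repeated x values; the function name is mine):

#include <cstddef>
#include <vector>

// Keep only points whose x value strictly increases; y is filtered in step.
void drop_duplicate_x(std::vector<double>& x, std::vector<double>& y) {
    std::size_t k = 0;                      // write index for kept points
    for (std::size_t i = 0; i < x.size(); ++i) {
        if (k == 0 || x[i] > x[k - 1]) {    // strictly greater than last kept x
            x[k] = x[i];
            y[k] = y[i];
            ++k;
        }
    }
    x.resize(k);
    y.resize(k);
}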