Nonnegative deviation variables in AMPL - linear-programming

I'm using AMPL and need to input my model that has nonnegative deviation variables (s+ - s-).
An example constraint is:
(x - 5) = (s+ - s-)

This is the way to do it:
var x;
var sp >= 0;
var sm >= 0;
s.t. cons1: (x - 5) = (sp - sm);
FYI, the AMPL book can be downloaded for free.
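To see why two nonnegative variables are enough, note that any real deviation d = x - 5 can be written as sp - sm with sp = max(d, 0) and sm = max(-d, 0). Below is a minimal Python sketch of that split; the helper name split_deviation is made up for illustration and is not part of AMPL. A solver is only guaranteed to return this particular pair when the objective penalizes sp + sm; otherwise other pairs with the same difference are also feasible.

```python
def split_deviation(x, target):
    """Split x - target into nonnegative parts (sp, sm) with x - target == sp - sm."""
    d = x - target
    sp = max(d, 0.0)   # positive deviation: amount by which x exceeds the target
    sm = max(-d, 0.0)  # negative deviation: amount by which x falls short
    return sp, sm


if __name__ == "__main__":
    for x in (8.0, 5.0, 1.5):
        sp, sm = split_deviation(x, 5.0)
        # The constraint (x - 5) = (sp - sm) holds for every x.
        print(x, sp, sm, sp - sm == x - 5.0)
```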

What conventions does IPOPT use to construct its Lagrangian?

I am using IPOPT via Pyomo (the AMPL interface) to solve a simple problem and am trying to validate that the primal Lagrangian gradient is zero at the solution. I'm running the following script, in which I construct what I would expect to be the gradient of the Lagrangian with respect to primal variables.
import pyomo.environ as pyo
from pyomo.common.collections import ComponentMap
m = pyo.ConcreteModel()
m.ipopt_zL_out = pyo.Suffix(direction=pyo.Suffix.IMPORT)
m.ipopt_zU_out = pyo.Suffix(direction=pyo.Suffix.IMPORT)
m.ipopt_zL_in = pyo.Suffix(direction=pyo.Suffix.EXPORT)
m.ipopt_zU_in = pyo.Suffix(direction=pyo.Suffix.EXPORT)
m.dual = pyo.Suffix(direction=pyo.Suffix.IMPORT_EXPORT)
m.v1 = pyo.Var(initialize=-2.0)
m.v2 = pyo.Var(initialize=2.0)
m.v3 = pyo.Var(initialize=2.0)
m.v1.setlb(-10.0)
m.v2.setlb(1.5)
m.v1.setub(-1.0)
m.v2.setub(10.0)
m.eq_con = pyo.Constraint(expr=m.v1*m.v2*m.v3 - 2.0 == 0)
obj_factor = 1
m.obj = pyo.Objective(
    expr=obj_factor*(m.v1**2 + m.v2**2 + m.v3**2),
    sense=pyo.minimize,
)
solver = pyo.SolverFactory("ipopt")
solver.solve(m, tee=True)
grad_lag_map = ComponentMap()
grad_lag_map[m.v1] = (
    (obj_factor*2*m.v1) + m.dual[m.eq_con]*m.v2*m.v3 +
    m.ipopt_zL_out[m.v1] + m.ipopt_zU_out[m.v1]
)
grad_lag_map[m.v2] = (
    (obj_factor*2*m.v2) + m.dual[m.eq_con]*m.v1*m.v3 +
    m.ipopt_zL_out[m.v2] + m.ipopt_zU_out[m.v2]
)
grad_lag_map[m.v3] = (
    (obj_factor*2*m.v3) + m.dual[m.eq_con]*m.v1*m.v2
)
for var, expr in grad_lag_map.items():
    print(var.name, pyo.value(expr))
According to this, however, the gradient of the Lagrangian is not zero when constructed in this way. I can get the gradient of the Lagrangian to be zero by using the following lines to construct grad_lag_map
grad_lag_map[m.v1] = (
    -(obj_factor*2*m.v1) + m.dual[m.eq_con]*m.v2*m.v3 +
    m.ipopt_zL_out[m.v1] + m.ipopt_zU_out[m.v1]
)
grad_lag_map[m.v2] = (
    -(obj_factor*2*m.v2) + m.dual[m.eq_con]*m.v1*m.v3 +
    m.ipopt_zL_out[m.v2] + m.ipopt_zU_out[m.v2]
)
grad_lag_map[m.v3] = (
    -(obj_factor*2*m.v3) + m.dual[m.eq_con]*m.v1*m.v2
)
With a minus sign in front of the objective gradient, the gradient of the Lagrangian is zero. This is surprising to me. I would not expect to see this factor of -1 for minimization problems. Can anybody confirm whether IPOPT constructs its Lagrangian with this -1 factor for minimization problems, or whether this is the artifact of some other convention I am unaware of?
This is the Gradient of the Lagrangian w.r.t. x computed in Ipopt (https://github.com/coin-or/Ipopt/blob/2b1a2f9a60fb3f8426b47edbe3b3520c7335d201/src/Algorithm/IpIpoptCalculatedQuantities.cpp#L2018-L2023):
tmp->Copy(*curr_grad_f());
tmp->AddTwoVectors(1., *curr_jac_cT_times_curr_y_c(), 1., *curr_jac_dT_times_curr_y_d(), 1.);
ip_nlp_->Px_L()->MultVector(-1., *z_L, 1., *tmp);
ip_nlp_->Px_U()->MultVector(1., *z_U, 1., *tmp);
This corresponds to an Ipopt-internal representation of a NLP, which has the form
min  f(x)                  dual vars:
s.t. c(x) = 0,             y_c
     d(x) - s = 0,         y_d
     d_L <= s <= d_U,      v_L, v_U
     x_L <= x <= x_U       z_L, z_U
The Lagrangian for Ipopt is then
f(x) + y_c c(x) + y_d (d(x) - s) + v_L (d_L-s) + v_U (s-d_U) + z_L (x_L-x) + z_U (x-x_U)
and the gradient w.r.t. x is thus
f'(x) + y_c c'(x) + y_d d'(x) - z_L + z_U
The NLP that is used by most Ipopt interfaces is
min  f(x)                  duals:
s.t. g_L <= g(x) <= g_U    lambda
     x_L <= x <= x_U       z_L, z_U
The Gradient of the Lagrangian would be
f'(x) + lambda g'(x) - z_L + z_U
In your code, you have a wrong sign for z_L.
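As a sanity check on the sign convention f'(x) + lambda g'(x) - z_L + z_U, here is a small pure-Python sketch evaluated on a toy problem. The function name grad_lagrangian, the toy problem, and the multiplier values are assumptions made for illustration: the multipliers were derived by hand from the stationarity conditions, not taken from an Ipopt run.

```python
def grad_lagrangian(grad_f, jac_g, lam, z_L, z_U):
    """Return f'(x) + lambda^T g'(x) - z_L + z_U, one entry per variable.

    jac_g[k][i] is the derivative of constraint k with respect to variable i.
    """
    out = []
    for i in range(len(grad_f)):
        term = grad_f[i]
        term += sum(lam[k] * jac_g[k][i] for k in range(len(lam)))
        term += -z_L[i] + z_U[i]  # lower-bound multipliers enter with a minus sign
        out.append(term)
    return out


# Hypothetical toy problem: min x1^2 + x2^2  s.t.  x1 + x2 = 2,  x1 >= 1.5
# Solution by hand: x = (1.5, 0.5), lambda = -1, z_L = (2, 0), z_U = (0, 0).
grad_f = [3.0, 1.0]      # (2*x1, 2*x2) at the solution
jac_g = [[1.0, 1.0]]     # gradient of x1 + x2
lam = [-1.0]
z_L = [2.0, 0.0]
z_U = [0.0, 0.0]

if __name__ == "__main__":
    print(grad_lagrangian(grad_f, jac_g, lam, z_L, z_U))
```

Adding z_L instead of subtracting it would leave a nonzero first component in this example, which mirrors the sign error pointed out above.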

Modeling a binary constraint in AMPL - CPLEX

I have the following constraints.
I tried to model them in AMPL using the following code:
var y {1..njobs} binary;
subject to overlap
{i in 1..njobs, j in i+1..njobs: i<>j}:
xi[i] + si[i] <= xi[j]+m*y[i];
subject to order
{i in 1..njobs, j in i+1..njobs: i<j}:
y[i] + y[j] = 1;
I'm new to this topic and seem to be missing something in the code above. Any suggestions?
According to the constraints, y has two indices, i and j, but your code only gives it a single index.
Should be something like:
var y {1..njobs,1..njobs} binary;
subject to overlap
{i in 1..njobs, j in i+1..njobs: i<>j}:
xi[i] + si[i] <= xi[j]+m*y[i,j];
subject to order
{i in 1..njobs, j in i+1..njobs: i<j}:
y[i,j] + y[j,i] = 1;
Currently the behaviour for when i = j is undefined. You may want to either add a constraint that defines the behaviour in that case, or exclude it from the index space when you declare y, e.g.:
var y {i in 1..njobs,j in 1..njobs: i <> j} binary;
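To illustrate the two-index structure outside AMPL, here is a small Python sketch that checks the two constraint families as written in the corrected model for a given assignment of values. The function name check_ordering, the 0-based indexing, and the sample data are assumptions for illustration only; y is stored as a dictionary keyed by ordered pairs (i, j) with i != j, matching the restricted index set.

```python
def check_ordering(x, s, y, big_m):
    """Check the 'overlap' and 'order' constraints for given values.

    x[i]: start of job i, s[i]: its duration,
    y[(i, j)]: binary value for the ordered pair (i, j), i != j.
    """
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            # overlap: x[i] + s[i] <= x[j] + m*y[i,j]
            if x[i] + s[i] > x[j] + big_m * y[(i, j)]:
                return False
            # order: y[i,j] + y[j,i] = 1
            if y[(i, j)] + y[(j, i)] != 1:
                return False
    return True


if __name__ == "__main__":
    # Two hypothetical jobs: job 0 runs on [0, 3], job 1 starts at 3.
    y = {(0, 1): 0, (1, 0): 1}
    print(check_ordering([0, 3], [3, 2], y, big_m=100))
```

With a single-index y[i], there is no way to store a separate value for each pair, which is why the original model could not express the ordering.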

Microsoft Solver Foundation code for intervals "lowerbound <= D_1 <= upperbound" OR "D_1 == 0" in C#

I am using Microsoft Solver Foundation version 3.0.2.10889 Express Edition for linear programming. I have no problems when using LP solver. However I am unable to set two interval constraints for one Decision at one time. What I would like is to set "D_1 == 0" when "lowerbound <= D_1 <= upperbound" was not satisfied.
I have done some search and found this:
Microsoft Solver Foundation for semi-integer
I tried to implement it, please see simplified code below. When I set the lowerbound for all decisions = 1, all decisions are resolved at least with value 1. To test the model more I set the lowerbound for all decisions = 5, hoping that I will get some decisions with result = 0. But the model is now infeasible.
It seems that the model does not work as I expected. lowerbound in the model still acts only as minimum value. Not in such a way that when lowerbound is infeasible, return 0. It implies binary does not have an effect on the model.
SolverContext context = SolverContext.GetContext();
context.ClearModel();
Model model = context.CreateModel();
Decision D_1 = new Decision(Domain.RealNonnegative, "D_1");
Decision D_2 = new Decision(Domain.RealNonnegative, "D_2");
model.AddDecisions(D_1, D_2);
Decision binary = new Decision(Domain.IntegerRange(0, 1), "binary");
model.AddDecisions(binary);
Term woodproduction = 0.5 * D_1 + 0.7 * D_2;
model.AddConstraint("constraint_woodproduction", woodproduction == 20.5);
Term lowerbound = 10;
Term upperbound = 100;
Term C_1 = lowerbound * binary <= D_1 <= upperbound * binary;
model.AddConstraint("constraint_1", C_1);
Term C_2 = lowerbound * binary <= D_2 <= upperbound * binary;
model.AddConstraint("constraint_2", C_2);
model.AddGoal("cost", GoalKind.Minimize, lpgoal);
Solution solution = context.Solve(new SimplexDirective
{
    IterationLimit = -1,
    GetInfeasibility = true, //GetSensitivity = true
});
Thank you for any feedback,
Zdenek

Pyomo: constraint with if statements

I am currently trying to solve this problem. I need to maximize the profit of this company.
That's the code I currently have:
from pyomo.environ import *
from pyomo.opt import *
opt = solvers.SolverFactory("ipopt")
model = ConcreteModel()
model.x1 = Var(within=NonNegativeIntegers)
model.x2 = Var(within=NonNegativeIntegers)
model.y1 = Var(within=NonNegativeIntegers)
model.y2 = Var(within=NonNegativeIntegers)
model.b1 = Var(within=Boolean)
model.b2 = Var(within=Boolean)
model.c1 = Constraint(expr = model.x1 + model.x2 + model.y1 + model.y2 <= 7000)
model.c2 = Constraint(expr = 2*model.x1 + 2*model.x2 + model.y1 + model.y2 <= 10000)
model.c3 = Constraint(expr = model.x1 <= 2000)
model.c4 = Constraint(expr = model.x2 <= 1000)
model.c5 = Constraint(expr = model.y1 <= 2000)
model.c6 = Constraint(expr = model.y2 <= 3000)
model.z = Objective(expr= (150*model.x1 + 180*model.x2*model.b1 + 100*model.y1 + 110*model.y2*model.b2), sense=maximize)
results = opt.solve(model)
This is the code I tried to write for my constraint, which should use only the first slope as long as production does not exceed 2000 products:
def ObjRule(model):
    if model.x1 >= 2000:
        return model.b1 == 1
    if model.x2 >= 2000:
        return model.b2 == 1
If someone has a hint on how I could proceed, that would be great.
Thank you in advance,
Patrick
In Pyomo, rules are not callbacks sent to a solver. They are called once for each index to obtain a static set of expressions. This set of expressions is what is sent to a solver. Any if-logic you use inside of rules should not involve the values of variables (unless it is based on the initial value of a variable, in which case you would wrap the variable in value() wherever you use it outside of the main expression that is returned).
If you want to model a piecewise function, you need to apply some kind of modeling trick to do so. In some cases this involves introducing discrete variables (see examples for the Piecewise component), in other cases it does not (for instance when maximizing a piecewise function that can be expressed as the min of a finite number of affine functions).
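The "min of a finite number of affine functions" case can be sketched in plain Python. The slopes and breakpoint below are hypothetical numbers chosen for illustration (180 per unit up to 2000 units, 150 per unit beyond that) and are not taken from the question; the trick only applies when the slopes decrease, so that the function is concave.

```python
def concave_piecewise(x, pieces):
    """Evaluate a concave piecewise-linear function as the min of affine pieces.

    pieces is a list of (slope, intercept) pairs.
    """
    return min(a * x + b for a, b in pieces)


# Hypothetical revenue curve: slope 180 up to 2000 units, slope 150 afterwards.
# The second intercept makes the two pieces meet at x = 2000.
pieces = [(180.0, 0.0), (150.0, 60000.0)]

if __name__ == "__main__":
    for x in (1000.0, 2000.0, 3000.0):
        print(x, concave_piecewise(x, pieces))
```

In a model, the same idea is usually written with an auxiliary variable t that is maximized subject to t <= a*x + b for every piece, so no if-logic or discrete variable is needed. When the slopes increase instead, this reformulation is no longer valid for maximization and discrete variables are typically required.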

Implementation of a softmax activation function for neural networks

I am using a Softmax activation function in the last layer of a neural network. But I have problems with a safe implementation of this function.
A naive implementation would be this one:
Vector y = mlp(x); // output of the neural network without softmax activation function
for(int f = 0; f < y.rows(); f++)
    y(f) = exp(y(f));
y /= y.sum();
This does not work well with more than 100 hidden nodes, because y will be NaN in many cases: if y(f) > 709, exp(y(f)) returns inf, and the subsequent division yields NaN. I came up with this version:
Vector y = mlp(x); // output of the neural network without softmax activation function
for(int f = 0; f < y.rows(); f++)
    y(f) = safeExp(y(f), y.rows());
y /= y.sum();
where safeExp is defined as
double safeExp(double x, int div)
{
    static const double maxX = std::log(std::numeric_limits<double>::max());
    const double max = maxX / (double) div;
    if(x > max)
        x = max;
    return std::exp(x);
}
This function limits the input of exp. In most of the cases this works but not in all cases and I did not really manage to find out in which cases it does not work. When I have 800 hidden neurons in the previous layer it does not work at all.
However, even if this worked I somehow "distort" the result of the ANN. Can you think of any other way to calculate the correct solution? Are there any C++ libraries or tricks that I can use to calculate the exact output of this ANN?
edit: The solution provided by Itamar Katz is:
Vector y = mlp(x); // output of the neural network without softmax activation function
double ymax = y.maxCoeff(); // maximal component of y (maxCoeff() assumes an Eigen-style Vector)
for(int f = 0; f < y.rows(); f++)
    y(f) = exp(y(f) - ymax);
y /= y.sum();
And it really is mathematically the same. In practice however, some small values become 0 because of the floating point precision. I wonder why nobody ever writes these implementation details down in textbooks.
First go to log scale, i.e., calculate log(y) instead of y. The log of the numerator is trivial. To calculate the log of the denominator, you can use the following 'trick': http://lingpipe-blog.com/2009/06/25/log-sum-of-exponentials/
I know it's already answered but I'll post here a step-by-step anyway.
Take the log:
zj = wj . x + bj
oj = exp(zj) / sum_i { exp(zi) }
log oj = zj - log sum_i { exp(zi) }
Let m = max_i { zi } and use the log-sum-exp trick:
log oj = zj - log { sum_i { exp(zi + m - m) } }
       = zj - log { sum_i { exp(m) exp(zi - m) } }
       = zj - log { exp(m) sum_i { exp(zi - m) } }
       = zj - m - log { sum_i { exp(zi - m) } }
The term exp(zi - m) can underflow if m is much greater than the other zi, but that is fine, since it means zi is irrelevant to the softmax output after normalization. The final result is:
oj = exp (zj - m - log{sum_i{exp(zi-m)}})
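The final formula can be sketched in a few lines of Python using only the standard library. The function names log_softmax and softmax are chosen here for illustration; the code follows the derivation above, with m the largest input.

```python
import math


def log_softmax(z):
    """Return log oj = zj - m - log(sum_i exp(zi - m)) for every j."""
    m = max(z)
    # Every exponent zi - m is <= 0, so exp cannot overflow; it may underflow to 0.
    log_sum = math.log(sum(math.exp(zi - m) for zi in z))
    return [zj - m - log_sum for zj in z]


def softmax(z):
    """Return oj = exp(log oj), computed via the shifted log-sum-exp."""
    return [math.exp(v) for v in log_softmax(z)]


if __name__ == "__main__":
    # A naive math.exp(1000.0) would raise OverflowError; the shifted version is fine.
    print(softmax([1000.0, 1000.0]))
    print(softmax([1.0, 2.0, 3.0]))
```

Since the sum always contains the term exp(m - m) = 1, the argument of the logarithm is at least 1 and the log never sees zero.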