How to properly include an if-statement with decision variables in CPLEX constraints

This is similar to the problem of moving from a decentralized system to a centralized one: I want to identify the optimal locations to use as centralized points and the locations that need to be closed. These are my binary decision variables Xi and Yj.
I have two constraints that include an if-statement with decision variables. I have read that in this case I must use logical constraints, so I did.
forall (i in Drives, j in Locations)(Y[j]==1 && Distance[j][i]<=20) => X[i]==0;
I want this constraint to say that if a location j is chosen (Yj = 1) and the distance between i and j is at most 20, then location i should be closed (Xi = 0).
forall (j in Locations, k in Locations)(Y[j]==1 && Distance2[j][k]<=40) => Y[k]==0;
Similarly, this constraint says that if a location j is chosen (Yj = 1) and the distance between the two potential locations j and k is at most 40, then location k should not be chosen (Yk = 0).
The model gives a result, but when I check the numbers it seems to ignore these two constraints. So something in the way I wrote them is not working properly.

The constraints look mostly correct to me. What looks a bit fishy in the second constraint is that you don't exclude the case j==k. If Y[j]==1 then probably Distance2[j][j]==0 and thus the second constraint implies Y[j]==0. A contradiction!
Are you sure that CPLEX claims your solution optimal? Or are you maybe looking at a relaxed solution (which would then be allowed to violate constraints)?
Assuming Distance is data and not a decision variable, your constraints could be written in a more efficient way. For example the first one:
forall(i in Drives)
  forall(j in Locations : Distance[j][i] <= 20)
    X[i] <= 1 - Y[j]; // if Y[j]==1 the right-hand side becomes zero and forces X[i]==0
Similarly, the second constraint could be written as
forall(j in Locations)
  forall(k in Locations : k != j && Distance2[j][k] <= 40)
    Y[k] <= 1 - Y[j]; // if Y[j]==1 the right-hand side becomes zero and forces Y[k]==0
Can you try these more explicit constraints, or at least exclude the case j==k in the second constraint?

How to write constraint with sum of absolutes in Integer Programming?

I found a solution for just one term here.
How can we formulate constraints of the form
|x1 - a1| + |x2 - a2| + ... + |xn - an| >= K
in Mixed Integer Linear Programming?
Let's write this as:
sum(i, |x[i]-a[i]|) >= K
This is non-convex and needs special treatment. Sorry: this is not very simple.
One way to model this is:
x[i] - a[i] = zplus[i] - zmin[i]
sum(i, zplus[i] + zmin[i]) >= K
zplus[i] <= δ[i]*M[i]
zmin[i] <= (1-δ[i])*M[i]
zmin[i],zplus[i] >= 0
δ[i] ∈ {0,1}
Here M[i] are large enough constants. Finding good values needs some thought: basically we want M[i] to be the largest value |x[i] - a[i]| can take given the bounds on x[i].
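To make this concrete, here is a minimal OPL sketch of the big-M formulation above (the data n, a, K, the bounds on x, and the values M[i] are made up for the example):
int n = 3;
range I = 1..n;
float a[I] = [1.0, 2.0, 3.0];
float K = 4.0;
float M[I] = [20.0, 20.0, 20.0]; // must bound |x[i] - a[i]|; problem-specific
dvar float x[I] in -10..10;
dvar float+ zplus[I];  // positive part of x[i] - a[i]
dvar float+ zmin[I];   // negative part of x[i] - a[i]
dvar boolean delta[I]; // selects which part may be nonzero
subject to {
  forall(i in I) x[i] - a[i] == zplus[i] - zmin[i];
  sum(i in I) (zplus[i] + zmin[i]) >= K;
  forall(i in I) zplus[i] <= delta[i]*M[i];    // delta[i]==0 forces zplus[i]==0
  forall(i in I) zmin[i] <= (1-delta[i])*M[i]; // delta[i]==1 forces zmin[i]==0
}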
Alternative formulations are possible using indicator constraints and SOS1 variables. Some modeling tools and solvers have direct support for absolute values.
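For instance, with CPLEX logical (indicator) constraints in OPL, the same selection can be written without the big-M bounds (a sketch, reusing the names from the model above):
forall(i in I) (delta[i] == 0) => (zplus[i] == 0);
forall(i in I) (delta[i] == 1) => (zmin[i] == 0);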

C++ idiom for loop edges

Problem 1: suppose you have an array of n floats and you want to calculate an array of n running averages over three elements. The middle part would be straightforward:
for (int i = 0; i < n; i++)
    b[i] = (a[i-1] + a[i] + a[i+1]) / 3.;
But you need to have separate code to handle the cases i==0 and i==(n-1). This is often done with extra code before the loop, extra code after the loop, and adjusting the loop range, e.g.
b[0] = (a[0] + a[1]) / 2.;
for (int i = 1; i < n-1; i++)
    b[i] = (a[i-1] + a[i] + a[i+1]) / 3.;
b[n-1] = (a[n-2] + a[n-1]) / 2.;
Even that is not enough, because the case n < 3 needs to be handled separately.
Problem 2: you are reading a variable-length code from an array (say, implementing a UTF-8 to UTF-32 converter). The code reads a byte and, accordingly, may read one or more further bytes to determine the output. However, before each such step it also needs to check whether the end of the input array has been reached, and if so, perhaps load more data into a buffer or terminate with an error.
Both of these problems are cases of loops where the interior of the loop can be expressed neatly, but the edges need special handling. I find these sorts of problems the most prone to error and to messy programming. So here's my question:
Are there any C++ idioms which generalize wrapping such loop patterns in a clean way?
Efficiently and elegantly handling boundary conditions is troublesome in any programming language -- C++ has no magic hammer for this. This is a common problem when applying convolution filters to signals/images: what do you do at the image boundaries, where your kernel extends outside the image support?
There are generally two things you are trying to avoid:
out of bounds array indexing (which you must avoid), and
special computation (which is inelegant and results in slower code due to extra branching).
There are usually three approaches:
Avoid the boundaries -- this is the simplest approach and is often sufficient, since the boundary cases make up a tiny slice of the problem and can frequently be ignored.
Extend the bounds of your buffer -- add extra columns/rows of padding to the array so the same code used in the general case can be used at the edges. Of course this raises the problem of what values to place in the padding -- this often depends on the problem you are solving and is considered in the next approach.
Special computation at the boundary -- this is what you do in your example. Of course, how you do this is problem-dependent and raises a similar issue to the previous approach: what is the correct thing to do when my filter (in your case an averaging filter) extends beyond the array support? What should the values outside the array support be considered to be? Most image filter libraries provide some form of extrapolation option -- for example:
assume a value of zero or some other constant (define a[i] = 0 if i < 0 || i >= n),
replicate the boundary value (e.g. a[i] = a[0] if i < 0, and a[i] = a[n-1] if i >= n),
wrap the value (define a[i] = a[(i + n) % n] -- makes sense in some cases, e.g. texture filters),
mirror the border (e.g. a[i] = a[abs(i+1)] if i < 0, and a[i] = a[2n - i - 1] if i >= n),
other special cases (what you do).
When reasonable, it's best to separate the special case from the general case (like you do) to avoid an inelegant and slow general case. One could always wrap/hide the special case and general case in a function or operator (e.g., overload operator[]), but this only sugar-coats the problem, like any contrived C++ idiom would. In a multi-threaded environment (e.g. CUDA/SIMD) you can do some other tricks, like preloading out-of-bounds values, but you are still stuck with the same problem.
This is why programmers use the phrase "edge case" when referring to any kind of special-case programming; it is often a time sink and a source of annoying errors. Some languages that efficiently support exception handling for out-of-bounds array indexing (e.g. Ada) can make for prettier code, but they still cause the same pain.
Unfortunately, the answer is no: there are no C++ idioms which generalize wrapping such loop patterns in a clean way.
You can get close with something like the following, but you still need to adjust the window size at the edges.
// Bounds-checked subscript: out-of-range indexes read as 0 (zero padding).
template <typename T, int N>
T subscript(T (&data)[N], int index) {
    if (index < 0 || index >= N) {
        return 0;
    }
    return data[index];
}

for (int i = 0; i < n; ++i) {
    b[i] = (subscript(a, i - 1) + subscript(a, i) + subscript(a, i + 1)) / 3.;
}
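Note that this zero-padding version changes the edge semantics compared to the explicit pre/post-loop code above: b[0] and b[n-1] are still divided by 3 even though only two real elements contribute, whereas the original edge code divided by 2.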

What does Big M method do in constraints when converting nonlinear programming into linear programming?

I have a question regarding these constraints from a paper. The paper says it used the big-M method to turn a nonlinear programming model into an LP. I get that the big number M1 is huge, but I don't understand what M1 actually does in the constraints. Could you give me some insight into the use of big M in these constraints?
Below are the constraints with the big number M1.
The paper says these constraints are, when K[m][i] = p[i]*x[m][i]:
maximize sum(m in M, i in I) (K[m][i] - c[i]*x[m][i])
K[m][i] - M[1]*(1-x[m][i]) <= p[i]
K[m][i] + M[1]*(1-x[m][i]) >= p[i]
K[m][i] - M[1]*x[m][i] <= 0
In the nonlinear program it originally looked like this:
maximize sum(m in M, i in I) (p[i] - c[i])*x[m][i]
So, basically, converting the nonlinear program into a linear program slightly changed some decision variables and added three constraints with the big number M.
Here is another constraint that includes big number M.
sum(j in J) b[i][j]*p[j] - p[i] <= M[1]*y[i]
which originally looked like
p[i] < sum(j in J) b[i][j]*p[j], if y[i] == 1
Here is the last constraint with the big number M:
(r[m][j] - p[j])*b[i][j]*x[m][i] >= -y[i]*M[1]
which was
(r[m][j] - p[j])*b[i][j]*x[m][i]*(1-y[i]) >= 0
in the nonlinear program.
I really want to know what big M does in the model. I would really appreciate it if anyone could give me some insight.
Thank you.
As you said, the big-M is used to model the non-linear constraint
K[m][i] = p[i] * x[m][i]
in case x is a binary variable. The assumption is that M is an upper bound on K[m][i] and that K[m][i] is a non-negative variable, i.e. 0 <= K[m][i] <= M. Also p is assumed to be non-negative.
Since x[m][i] is binary, we can have two cases in a feasible solution:
x[m][i] = 0. In that case the product p[i] * x[m][i] is 0 and thus K[m][i] should be zero as well. This is enforced by constraint K[m][i] - M * x[m][i] <= 0 which in this case becomes just K[m][i] <= 0. The two other constraints involving M become redundant in this case. For example, the first constraint reduces to K[m][i] <= p[i] + M which is always true since M is an upper bound on K[m][i] and p is non-negative.
x[m][i] = 1. In that case the product p[i] * x[m][i] is just p[i] and the first two constraints involving M become K[m][i] <= p[i] and K[m][i] >= p[i] (which is equivalent to K[m][i] = p[i]). The last constraint involving M becomes K[m][i] <= M which is redundant since M is an upper bound on K[m][i].
So the role of M here is to "enable/disable" certain constraints depending on the value of x.
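To see this in action, here is a tiny self-contained OPL sketch of the linearization of K == p*x (the values of p and M are made up for the example):
float p = 5.0;
float M = 100.0; // upper bound on K
dvar boolean x;
dvar float K in 0..M;
maximize K;
subject to {
  K - M*(1-x) <= p; // binding only when x==1: K <= p
  K + M*(1-x) >= p; // binding only when x==1: K >= p
  K - M*x <= 0;     // binding only when x==0: K <= 0
}
Maximizing K drives x to 1 and K to p; fixing x to 0 would instead force K to 0.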
To model logical conditions you may either use logical constraints or rely on big M:
https://www.ibm.com/support/pages/difference-between-using-indicator-constraints-and-big-m-formulation
I tend to suggest logical constraints as the default choice.
From https://www.linkedin.com/pulse/how-opl-alex-fleischer/, let me share an example:
How to multiply a decision variable by a boolean decision variable in CPLEX?
// suppose we want b * x <= 7
dvar int x in 2..10;
dvar boolean b;
dvar int bx; // stands for the product b*x
maximize x;
subject to
{
  // Linearization of bx == b*x, using the bounds 2 <= x <= 10
  bx <= 7;
  2*b <= bx;          // if b==1 then bx >= 2 (lower bound of x)
  bx <= 10*b;         // if b==0 then bx == 0
  bx <= x - 2*(1-b);  // if b==1 then bx <= x
  bx >= x - 10*(1-b); // if b==1 then bx >= x
  // with CP we could write directly
  // b*x <= 7
  // or rely on logical constraints within CPLEX:
  // (b==1) => (bx==x);
  // (b==0) => (bx==0);
}
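Note that in this particular toy model the solver will pick b == 0, in which case the linearization forces bx == 0 and x can reach its upper bound of 10 while b*x <= 7 still holds.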

Loop starting from 0 or 1? Which one and why?

Most of the for loops I have read or written start from 0, and to be fair most of the code I have read was for embedded systems, written in C/C++. In embedded systems, readability is sometimes less important than code efficiency. Therefore, I am not sure which of the following versions would be a better choice:
version 1
for (i = 0; i < allowedNumberOfIteration; i++)
{
    // something that may take from 1 iteration to allowedNumberOfIteration before it happens
    if (somethingHappened)
    {
        if (i + 1 > maxIteration)
        {
            maxIteration = i + 1;
        }
    }
}
Version 2
for (i = 1; i <= allowedNumberOfIteration; i++)
{
    // something that may take from 1 iteration to allowedNumberOfIteration before it happens
    if (somethingHappened)
    {
        if (i > maxIteration)
        {
            maxIteration = i;
        }
    }
}
Why the first version is better in my opinion:
1. Most loops start with 0, so experienced programmers may find starting from 0 more natural.
Why the second version is better in my opinion:
To be fair, if an array were used in the function, starting from 0 would be great, because array indexes start at zero. But no arrays are used in this part of the code.
Besides, the second version looks simpler because you do not have to think about the '+1'.
Things I do not know
1) Is there any performance difference?
2) Which version is better?
3) Are there any other aspects that should be considered when deciding the starting point?
4) Am I worrying too much?
1) No
2) Neither
3) Arrays in C and C++ are zero-based.
4) Yes.
Arrays of all forms in C++ are zero-based, i.e. their indexes start at zero and go up to the size of the array minus one. For example, an array of five elements has indexes 0 to 4 (inclusive).
That is why most loops in C++ are starting at zero.
As for your specific list of questions: for 1, there might be a performance difference. If you start a loop at 1, you might need to subtract 1 in each iteration if you use the value as an array index; or if you instead increase the size of the arrays, you use more memory.
For 2, it really depends on what you're iterating over. If it is array indexes, then starting the loop at zero is clearly better. But you might need to start a loop at any value; it really depends on what you're doing and the problem you're trying to solve.
For 3, what you need to consider is what you're using the loop for.
And 4, maybe a little. ;)
This argument comes from a small, 3-page note by the famous computer scientist Dijkstra (the one from Dijkstra's algorithm). In it, he lays out the reasons we might index starting at zero, and the story begins with trying to denote a sequence of natural numbers (meaning a sequence on the number line 0, 1, 2, 3, ...).
There are 4 possibilities to denote the sequence 2, 3, ..., 12:
a.) 2 <= i < 13
b.) 1 < i <= 12
c.) 2 <= i <= 12
d.) 1 < i < 13
He mentions that a.) and b.) have the advantage that the difference of the two bounds equals the number of elements in the sequence. He also mentions that if two sequences are adjacent, the upper bound of one equals the lower bound of the other. He says this doesn't help decide between a.) and b.), so he starts afresh.
He immediately removes b.) and d.) from the list since, if we were to start a natural sequence with zero, their lower bound would lie outside the natural numbers (-1), which is "ugly". He completes the observation by saying that we therefore prefer <= for the lower bound -- leaving us with a.) and c.).
For an empty sequence, he notes that b.) and c.) would have -1 as their upper bound, which is also "ugly".
These observations lead to the convention of representing a sequence of natural numbers with a.), and that is indeed how most people write a for loop that goes over an array: for(int i = 0; i < size; ++i). We include the lower bound (0 <= i), and we exclude the upper bound (i < size).
If you were to use something like for(int i = 0; i <= iterations - 1; ++i) to loop iterations times, you can see the ugliness he refers to in the case of the empty sequence: iterations - 1 would be -1 for zero iterations.
So by convention we use a.), and due to zero-based array indexing we start a huge number of for loops with i = 0. Then we reason by parsimony: we might as well do different things the exact same way when there is no reason to do one of them differently.
Now, if we were to use a.) with 1-based indexing into an array instead of 0-based indexing, we would get for(int i = 1; i < size + 1; ++i). The + 1 is "ugly", so we prefer to start our range with i = 0.
In conclusion, to loop iterations times you should write for(int i = 0; i < iterations; ++i). Something like for(int i = 1; i <= iterations; ++i) is fairly understandable and works, but is there any good reason to add a different way to loop iterations times? Just use the same pattern as when indexing an array, i.e. 0 <= i < size. Worse, the loop based on 1 <= i <= iterations has none of the reasons Dijkstra came up with in support of using 0 <= i < iterations as a convention.
You're not worrying too much. In fact, Dijkstra himself wondered about the exact same question, as has pretty much every serious programmer. Tuning your style like a craftsman who loves their trade is the ground a great programmer stands on. Pursuing parsimony and writing code the way others tend to write it (including yourself -- the array loop above!) are both sane, great things to pursue.
Due to this convention, when I see for(i = 1; ...) I notice a departure from the convention. I am then more cautious around that code, thinking the logic within the for might depend on starting at 1 instead of 0. This is slight, but there's no reason to add that possibility when a convention is so widely used. If you happen to have a large for body, this complaint becomes less slight.
To understand why starting at one makes no sense, consider taking the argument to its natural conclusion - the argument of "but it makes sense to me!": You can start i at anything! If we free ourselves from convention, why not loop for(int i = 5; i <= iterations + 4; ++i)? Or for(int i = -5; i > -iterations - 5; --i)? Just do it the way a majority of programmers do in the majority of cases, and save being different for when there's a good reason - the difference signals to the programmer reading your code that the body of the for contains something unusual. With the standard way, we know the for is either indexing/ordering/doing arithmetic with a sequence starting at 0 or executing some logic iterations times in a row.
Note how prevalent this convention is, too. In C++, every standard container iterates over [start, end), which corresponds to a.) above. There, they do it so that the end condition can be iter != end, but the fact that we already do the logic one way, and that this way has no immediate drawbacks, flows naturally into the argument "why do it two different ways when we already do it this way in this context?". In his little note, Dijkstra also mentions a language called Mesa that can express a.), b.), c.), or d.) with particular syntax. He claims that there, a.) won out in practice, and that the others were associated with the cause of bugs. He then laments how FORTRAN indexes at 1 and how Pascal took on c.) by convention.

Linear program objective function depends on sign of variable

I'm trying to find Q[i] to maximize
Sum[Q[i] F[i] - C[i], {i, 1, n}]
subject to some linear constraints. The problem: C[i] is a function of Q[i] but isn't linear. It equals Q[i] * Cp if Q[i] >= 0, and -Q[i] * Cn if Q[i] < 0 (basically a cost term that is different depending on whether Q[i] is positive or negative).
I suspect I need to use some version of integer programming to reformulate this properly but can't see how to get there. Can anyone point me the right way, or maybe just tell me this can't be done? :)
Here is a Mixed Integer formulation with some additional binary variables:
We use variable splitting to represent Q with two components, a positive one and a negative one. Using a binary variable we make sure only one of those components is nonzero. This requires new continuous variables q+ and q- and new binary variables delta.
The constants M+ and M- are upper bounds on q+ and q-. Make them as small as possible (100 or 1000 is better than 1e6 or 1e7).
Now there is something we can exploit. The objective will push down the C term in order to maximize the total objective. This means we can drop the equations with the binary variable, as automatically only one of q-, q+ will be nonzero. I.e. if Q=-10, it will prefer q+ = 0, q- = 10 above q+ = 2, q- = 12. So the final model is actually a straight LP:
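Written out, the final LP is (a sketch reconstructed from the description above, with qplus and qmin as the split components and Cp, Cn assumed non-negative):
maximize sum(i, F[i]*Q[i] - Cp*qplus[i] - Cn*qmin[i])
Q[i] = qplus[i] - qmin[i]
0 <= qplus[i] <= M+
0 <= qmin[i] <= M-
plus the original linear constraints on Q.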