Linear programming - multi-objective optimization. How to build single function - linear-programming

I have LP problem
It is similar to assignment problem for workers
I have three assignments with constraint scores(Constraint 1 is more important than constraint 2)
Constraint1 Constraint2
assign1 590 5
assign2 585 580
assign3 585 336
My current greedy solver will compare assignments by the first constraint. The best one becomes a solution. Solver will compare Constraint 2 if and only if he found two assignments with the same score value for previous constraint and so on.
For a given example in the first round assign2 and assign3 will be chosen because they have the same lowest constraint score. After second round solver choses assign3. So I need a cost function which will represent that behavior.
So I expect the
assign1_cost > assign2_cost > assign3_cost.
Is it posible to do?
I believe that I can apply some weighted math function or something like that.

There's two ways to do this simply, that I know of:
Assume that you can put an upper bound on each objective function (besides the first). Then you rewrite your objective as follows: (objective_1 * max_2 + objective_2) * max_3 + objective_3 and so on, and minimize said objective. For your example, say you know that objective_2 will be less than 1000. You can then solve, minimizing objective_1*1000 + objective_2. This may fail with some solvers if your upper bounds are too large.
This method requires n solver iterations, where n is the number of objectives, but is general - it doesn't require you to know upper bounds beforehand. Solve, minimizing objective_1 and ignoring the other objective functions. Add a constraint that objective_1 == <value you just got>. Solve, minimizing objective_2 and ignoring the remaining objective functions. Repeat until you've gone through every objective function.

Related

DoCPLEX Solving LP Problem Partially at a time

I am working on Linear Programming Problem with 800K Constraints and the problem takes 20 mins to solve but if I solve the problem for half horizon it just takes 1 min. Is there a way in DoCPLEX where I can solve for partial horizon and then use the solution to solve for other half of the problem without using a for-loop
Three suggestions:
load your problem as LP or SAV into cplex interactive optimizer and run display problem stats. This might show (or rule out) precision issues (ill-conditioned problem). Also it will output number of nonzeros
set datacheck parameters to 2, this might detect numerical issues in data
have you tried different LP algorithms? Using the lpmethod parameter you could try primal, dual or barrier algorithm to see whether one runs faster on your problem.
Reference:
https://www.ibm.com/support/knowledgecenter/SSSA5P_12.10.0/ilog.odms.cplex.help/CPLEX/Parameters/topics/LPMETHOD.html
In DOcplex:
model.parameters.datacheck = 2
model.parameters.lpmethod = 4 # for barrier
From your answers, I can think of the following:
if you are in pure LP (is this true?) I see no point in rounding numbers (but yes, that would help in a MIP, try rounding coefficients whose fractional part is say less than 1e-7: 4.0000001 -> 4)
1e+14 conditioning denotes serious modeling issue: a common source is mixing different objectives with coefficients. Have you tried multi-objective to avoid that?
Another source is big_M formulations, to which you should prefer indicator constraints. If you are not in these two cases, then try to renormalize the data to keep in a smaller condition range...
Finally, you might try setting markowitz tolerance to 0.99, to add extra cautiouness in simplex factorizations, but behavior may vary from one dataset to the other...

If condition in Cplex objective function

I am new to Cplex. I'm solving an integer programming problem, but I have a problem with objective function.
Problem is that I have some project, that have a due date D, and if project is tardy, than I have a tardiness penalty b, so it looks like b*(cn-D). Where cn is a real complection time of a project, and it is decision variable.
It must look like this
if (cn-D)>=0 then b*(cn-D)==0
I tried to use "if-then" constraint, but seems that it isn't working with decision variable.
I looked on question similar to this, but but could not find solution. Please, help me define correct objective function.
The standard way to model this is:
min sum(i, penalty(i)*Tardy(i))
Tardy(i) >= CompletionTime(i) - DueDate(i)
Tardy(i) >= 0
Tardy is a non-negative variable and can never become negative. The other quantities are:
penalty: a constant indicating the cost associated with job i being tardy one unit of time.
CompletionTime: a variable that holds the completion time of job i
DueDate: a constant with the due date of job i.
The above measures the sum. Sometimes we also want to measure the count: the number of jobs that are tardy. This is to prevent many jobs being tardy. In the most general case one would have both the sum and the count in the objective with different weights or penalties.
There is a virtually unlimited number of papers showing MIP formulations on scheduling models that involve tardiness. Instead of reinventing the wheel, it may be useful to consult some of them and see what others have done to formulate this.

Another way to calculate double type variables in c++?

Short version of the question: overflow or timeout in current settings when calculating large int64_t and double, anyway to avoid these?
Test case:
If only demand is 80,000,000,000, solved with correct result. But if it's 800,000,000,000, returned incorrect 0.
If input has two or more demands (means more inequalities need to be calculated), smaller value will also cause incorrectness. e.g., three equal demands of 20,000,000,000 will cause the problem.
I'm using COIN-OR CLP linear programming solver to solve some network flow problems. I use int64_t when representing the link bandwidth. But CLP uses double most of time and cannot transfer to other types easily.
When the values of the variables are not that large (typically smaller than 10,000,000,000) and the constraints (inequalities) are relatively few, it will give the solution I want it to. But if either of the above factors increases, the tool will stop and return a 0 value solution. I think the reason is the calculation complexity is over its maximum, so program breaks at some trivial point (it uses LP simplex method).
The inequality is some kind of:
totalFlowSum <= usePercentage * demand
I changed it to
totalFlowSum - usePercentage * demand <= 0
Since totalFLowSum and demand are very large int64_t, usePercentage is double, if the constraints like this are too many (several or even more), or if the demand is larger than 100,000,000,000, the returned solution will be wrong.
Is there any way to correct this, like increase the break threshold or avoid this level of calculation magnitude?
Decrease some accuracy is acceptable. I have a possible solution is that 1,000 times smaller on inputs and 1,000 time larger on outputs. But this is kind of naïve and may cause too much code modification in the program.
Update:
I have changed the formulation to
totalFlowSum / demand - usePercentage <= 0
but the problem still exists.
Update 2:
I divided usePercentage by 1000, making its coefficient from 1 to 0.001, it worked. But if I also divide totalFlowSum/demand by 1000 simultaneously, still no result. I don't know why...
I changed the rhs of equalities from 0 to 0.1, the problem is then solved! Since the inputs are very large, 0.1 offset won't impact the solution at all.
I think the reason is that previous coeffs are badly scaled, so the complier failed to find an exact answer.

Finding an optimal solution to a system of linear equations in c++

Here's the problem:
I am currently trying to create a control system which is required to find a solution to a series of complex linear equations without a unique solution.
My problem arises because there will ever only be six equations, while there may be upwards of 20 unknowns (usually way more than six unknowns). Of course, this will not yield an exact solution through the standard Gaussian elimination or by changing them in a matrix to reduced row echelon form.
However, I think that I may be able to optimize things further and get a more accurate solution because I know that each of the unknowns cannot have a value smaller than zero or greater than one, but it is free to take on any value in between them.
Of course, I am trying to create code that would find a correct solution, but in the case that there are multiple combinations that yield satisfactory results, I would want to minimize Sum of (value of unknown * efficiency constant) over all unknowns, i.e. Sigma[xI*eI] from I=0 to n, but finding an accurate solution is of a greater priority.
Performance is also important, due to the fact that this algorithm may need to be run several times per second.
So, does anyone have any ideas to help me on implementing this?
Edit: You might just want to stick to linear programming with equality and inequality constraints, but here's an interesting exact solution that does not incorporate the constraint that your unknowns are between 0 and 1.
Here's a powerpoint discussing your problem: http://see.stanford.edu/materials/lsoeldsee263/08-min-norm.pdf
I'll translate your problem into math to make things a bit easier to figure out:
you have a 6x20 matrix A and a vector x with 20 elements. You want to minimize (x^T)e subject to Ax=y. According to the slides, if you were just minimizing the sum of x, then the answer is A^T(AA^T)^(-1)y. I'll take another look at this as soon as I get the chance and see what the solution is to minimizing (x^T)e (ie your specific problem).
Edit: I looked in the powerpoint some more and near the end there's a slide entitled "General norm minimization with equality constraints". I am going to switch the notation to match the slide's:
Your problem is that you want to minimize ||Ax-b||, where b = 0 and A is your e vector and x is the 20 unknowns. This is subject to Cx=d. Apparently the answer is:
x=(A^T A)^-1 (A^T b -C^T(C(A^T A)^-1 C^T)^-1 (C(A^T A)^-1 A^Tb - d))
it's not pretty, but it's not as bad as you might think. There's really aren't that many calculations. For example (A^TA)^-1 only needs to be calculated once and then you can reuse the answer. And your matrices aren't that big.
Note that I didn't incorporate the constraint that the elements of x are within [0,1].
It looks like the solution for what I am doing is with Linear Programming. It is starting to come back to me, but if I have other problems I will post them in their own dedicated questions instead of turning this into an encyclopedia.

parallel calculation of infinite series

I just have a quick question, on how to speed up calculations of infinite series.
This is just one of the examples:
arctan(x) = x - x^3/3 + x^5/5 - x^7/7 + ....
Lets say you have some library which allow you to work with big numbers, then first obvious solution would be to start adding/subtracting each element of the sequence until you reach some target N.
You also can pre-save X^n so for each next element instead of calculating x^(n+2) you can do lastX*(x^2)
But over all it seems to be very sequential task, and what can you do to utilize multiple processors (8+)??.
Thanks a lot!
EDIT:
I will need to calculate something from 100k to 1m iterations. This is c++ based application, but I am looking for abstract solution, so it shouldn't matter.
Thanks for reply.
You need to break the problem down to match the number of processors or threads you have. In your case you could have for example one processor working on the even terms and another working on the odd terms. Instead of precalculating x^2 and using lastX*(x^2), you use lastX*(x^4) to skip every other term. To use 8 processors, multiply the previous term by x^16 to skip 8 terms.
P.S. Most of the time when presented with a problem like this, it's worthwhile to look for a more efficient way of calculating the result. Better algorithms beat more horsepower most of the time.
If you're trying to calculate the value of pi to millions of places or something, you first want to pay close attention to choosing a series that converges quickly, and which is amenable to parallellization. Then, if you have enough digits, it will eventually become cost-effective to split them across multiple processors; you will have to find or write a bignum library that can do this.
Note that you can factor out the variables in various ways; e.g.:
atan(x)= x - x^3/3 + x^5/5 - x^7/7 + x^9/9 ...
= x*(1 - x^2*(1/3 - x^2*(1/5 - x^2*(1/7 - x^2*(1/9 ...
Although the second line is more efficient than a naive implementation of the first line, the latter calculation still has a linear chain of dependencies from beginning to end. You can improve your parallellism by combining terms in pairs:
= x*(1-x^2/3) + x^3*(1/5-x^2/7) + x^5*(1/9 ...
= x*( (1-x^2/3) + x^2*((1/5-x^2/7) + x^2*(1/9 ...
= [yet more recursive computation...]
However, this speedup is not as simple as you might think, since the time taken by each computation depends on the precision needed to hold it. In designing your algorithm, you need to take this into account; also, your algebra is intimately involved; i.e., for the above case, you'll get infinitely repeating fractions if you do regular divisions by your constant numbers, so you need to figure some way to deal with that, one way or another.
Well, for this example, you might sum the series (if I've got the brackets in the right places):
(-1)^i * (x^(2i + 1))/(2i + 1)
Then on processor 1 of 8 compute the sum of the terms for i = 1, 9, 17, 25, ...
Then on processor 2 of 8 compute the sum of the terms for i = 2, 11, 18, 26, ...
and so on, finally adding up the partial sums.
Or, you could do as you (nearly) suggest, give i = 1..16 (say) to processor 1, i = 17..32 to processor 2 and so on, and they can compute each successive power of x from the previous one. If you want more than 8x16 elements in the series, then assign more to each processor in the first place.
I doubt whether, for this example, it is worth parallelising at all, I suspect that you will get to double-precision accuracy on 1 processor while the parallel threads are still waking up; but that's just a guess for this example, and you can probably many series for which parallelisation is worth the effort.
And, as #Mark Ransom has already said, a better algorithm ought to beat brute-force and a lot of processors every time.