pulp shadow price difference with gurobi

I am comparing the shadow prices (pi) calculated with Gurobi and with PuLP. I get different values for the same input, and I am not sure how to do it correctly with PuLP. Here is the LP file that I use:
Minimize
x[0] + x[1] + x[2] + x[3]
Subject To
C[0]: 7 x[0] >= 211
C[1]: 3 x[1] >= 395
C[2]: 2 x[2] >= 610
C[3]: 2 x[3] >= 97
Bounds
End
For the above LP file, Gurobi gives me shadow prices:
[0.14285714285714285, 0.3333333333333333, 0.5, 0.5]
and with PuLP I get:
[0.14285714, 0.33333333, 0.5, 0.5]
But if I solve the following LP model:
Minimize
x[0] + x[1] + x[2] + x[3] + x[4]
Subject To
C[0]: 7 x[0] + 2 x[4] >= 211
C[1]: 3 x[1] >= 395
C[2]: 2 x[2] + 2 x[4] >= 610
C[3]: 2 x[3] >= 97
Bounds
End
With Gurobi I get:
[0.0, 0.3333333333333333, 0.5, 0.5]
and with PuLP I get:
[0.14285714, 0.33333333, 0.5, 0.5]
The correct values are the ones that Gurobi returns (I think?).
Why do I get the same shadow prices with PuLP for different models? How can I get the same results as Gurobi?
(I did not supply the source code because the question would be too long; I think the LP models are enough.)

In the second example, there are two dual solutions that are optimal: the one PuLP gives you, and the one you get by calling Gurobi directly. The unique optimal primal solution is [0.0, 131.67, 199.5, 48.5, 105.5], which makes the slacks of all the constraints 0 in the optimal primal solution. For C[0], if you reduce the right-hand side you get no reduction in the objective, but if you increase it, the cheapest way to make the constraint feasible is by increasing x[0]. Gurobi only guarantees that it will produce an optimal primal and dual solution; the specific optimal solution you get is arbitrary.
The first example is just a matter of printed precision: 0.14285714 is 1/7 rounded to eight digits.
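For reference, here is a minimal sketch of how the duals can be read from PuLP (this assumes PuLP with its bundled CBC solver; it rebuilds the first model from the question, with the constraints named C0 to C3):
import pulp

# Rebuild the first model from the question.
prob = pulp.LpProblem("shadow_prices", pulp.LpMinimize)
x = [pulp.LpVariable("x%d" % i, lowBound=0) for i in range(4)]
prob += pulp.lpSum(x)  # objective: x[0] + x[1] + x[2] + x[3]
coeffs = [7, 3, 2, 2]
rhs = [211, 395, 610, 97]
for i in range(4):
    prob += coeffs[i] * x[i] >= rhs[i], "C%d" % i

prob.solve()  # default CBC solver
print([prob.constraints["C%d" % i].pi for i in range(4)])
# for this first model CBC agrees with Gurobi: roughly [1/7, 1/3, 0.5, 0.5]
And if you want PuLP to return exactly the duals Gurobi computes, solve through Gurobi from within PuLP, e.g. prob.solve(pulp.GUROBI()), provided gurobipy is installed.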

Related

Can you get a list of the powers in a polynomial? Pari GP

I'm working with single-variable polynomials with coefficients +1/-1 (and zero). These can be very long and the range of powers can be quite big. It would be convenient for me to view the powers as a vector - is there any way of doing this quickly? I had hoped there would be a command already in PARI to do this, but I can't seem to find one.
Just an example to confirm what I'm trying to do...
Input: x^10 - x^8 + x^5 - x^2 + x + 1
Desired output: [10, 8, 5, 2, 1, 0]
You can use Vecrev to get the polynomial coefficients, ordered from the constant term upward, so the coefficient at (1-based) position i belongs to the power i-1. After that, just enumerate them and select the zero-based positions of the non-zeros; the outer Vecrev below merely reorders the powers from highest to lowest. You want the following one-liner:
nonzeros(xs) = Vecrev([x[2]-1 | x <- select(x -> x[1] != 0, vector(#xs, i, [xs[i], i]))])
Now you can easily get the list of polynomial powers:
p = x^10 - x^8 + x^5 - x^2 + x + 1
nonzeros(Vecrev(p))
>> [10, 8, 5, 2, 1, 0]

Random lottery numbers from macro to variable

I have the following lottery numbers in a macro:
global lottery 6 9 4 32 98
How can I simulate a variable with 50 observations, where each observation is randomly obtained from the numbers stored in the macro?
The code below produces an error:
set obs 50
global lottery 6 9 4 32 98
g lot=$lottery
invalid '9'
r(198);
Here are two similar methods:
clear
set obs 50
set seed 2803
local lottery 6 9 4 32 98
* method 1
generate x1 = ceil(5 * runiform())
tabulate x1
generate y1 = .
forvalues j = 1/5 {
replace y1 = real(word("`lottery'", `j')) if x1 == `j'
}
* method 2
set seed 2803
generate x2 = runiform()
generate y2 = cond(x2 <= 0.2, 6, cond(x2 <= 0.4, 9, cond(x2 <= 0.6, 4, cond(x2 <= 0.8, 32, 98))))
tab1 y?
I am assuming that you want equal probabilities, which is not explicit in the question. Setting the seed is crucial for reproducibility. As in @Pearly Spencer's answer, using a local macro is widely considered (much) better style.
To spell it out: in this answer the probabilities are equal, but the frequencies will fluctuate from sample to sample. In @Pearly's answer the frequencies are guaranteed equal, given that 50 is a multiple of 5, and randomness shows up only in the order in which the numbers arrive. This answer is like tossing a die with five faces 50 times; @Pearly's answer is like dealing from a deck of 50 cards with 5 distinct types.

SoPlex yields wrong answer

I have an LP in CPLEX LP format, in file LP.tmp:
Maximize
obj: x0 + 2 x1 + 3 x2 + 4 x3 + 5 x4 + 7 x5
Subject To
c0: 1 x0 + 1 x1 + 1 x2 + 1 x3 + 1 x4 + 1 x5 + 1 x6 + 1 x7 + 1 x8 = 0
End
On this I call soplex -X -x -o0 -f0 -v3 LP.tmp
This is obviously unbounded, but SoPlex gives me the following answer (along with some other lines):
SoPlex status : problem is solved [optimal]
Solving time (sec) : 0.00
Iterations : 0
Objective value : 0.00000000e+00
Primal solution (name, value):
All other variables are zero (within 1.0e-16). Solution has 0 nonzero entries.
Background: originally I had objective 0 but box constraints, and I always got "infeasible". So I reduced everything until I arrived at the above.
What am I doing wrong?
All variables are non-negative by default in the LP file format; see https://www.ibm.com/support/knowledgecenter/SSSA5P_12.5.0/ilog.odms.cplex.help/CPLEX/FileFormats/topics/LP.html
Therefore, your constraint fixes all variables to 0. As soon as you change the coefficient of any of the variables but x5 to -1, or add a bounds section that declares a variable free, e.g., x1 free, SoPlex claims unboundedness and provides a valid primal ray.
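A minimal sketch of that second fix, in the same LP format (only the Bounds section is new):
Maximize
obj: x0 + 2 x1 + 3 x2 + 4 x3 + 5 x4 + 7 x5
Subject To
c0: 1 x0 + 1 x1 + 1 x2 + 1 x3 + 1 x4 + 1 x5 + 1 x6 + 1 x7 + 1 x8 = 0
Bounds
x1 free
End
With x1 free, you can increase x5 and offset it with x1 = -x5, so the objective 2 x1 + 7 x5 = 5 x5 grows without bound and SoPlex reports unboundedness.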
This model is not unbounded. There is an implicit lower bound of 0 on every variable, so the only feasible, and hence optimal, solution is the all-zero one that SoPlex returns.
In the .lp data format, all variables are non-negative by default.

dramatic error in lp_solve?

I have a simple problem that I passed to lp_solve via the IDE (5.5.2.0):
/* Objective function */
max: +r1 +r2;
/* Constraints */
R1: +r1 +r2 <= 4;
R2: +r1 -2 b1 = 0;
R3: +r2 -3 b2 = 0;
/* Variable bounds */
b1 <= 1;
b2 <= 1;
/* Integer definitions */
int b1,b2;
The obvious optimum of this problem is 3: b1 = 0 and b2 = 1 give r1 = 0 and r2 = 3, whereas b1 = b2 = 1 would give r1 + r2 = 5, violating R1. SCIP as well as CBC give 3 as the answer, but with lp_solve I get 2. Is there a major bug in the solver?
Thanks in advance.
I had contact with the developer group that maintains the lp_solve software. The error will be fixed in the next version of lp_solve.
When I tried it, I got 3 as the optimal value of the objective function.
Model name: 'LPSolver' - run #1
Objective: Maximize(R0)
SUBMITTED
Model size: 3 constraints, 4 variables, 6 non-zeros.
Sets: 0 GUB, 0 SOS.
Using DUAL simplex for phase 1 and PRIMAL simplex for phase 2.
The primal and dual simplex pricing strategy set to 'Devex'.
Relaxed solution 4 after 4 iter is B&B base.
Feasible solution 2 after 6 iter, 3 nodes (gap 40.0%)
Optimal solution 2 after 7 iter, 4 nodes (gap 40.0%).
Excellent numeric accuracy ||*|| = 0
MEMO: lp_solve version 5.5.2.0 for 32 bit OS, with 64 bit REAL variables.
In the total iteration count 7, 1 (14.3%) were bound flips.
There were 2 refactorizations, 0 triggered by time and 0 by density.
... on average 3.0 major pivots per refactorization.
The largest [LUSOL v2.2.1.0] fact(B) had 8 NZ entries, 1.0x largest basis.
The maximum B&B level was 3, 0.8x MIP order, 3 at the optimal solution.
The constraint matrix inf-norm is 3, with a dynamic range of 3.
Time to load data was 0.001 seconds, presolve used 0.017 seconds,
... 0.007 seconds in simplex solver, in total 0.025 seconds.

Split Data knowing its common ID

I want to split this data,
ID x y
1 2.5 3.5
1 85.1 74.1
2 2.6 3.4
2 86.0 69.8
3 25.8 32.9
3 84.4 68.2
4 2.8 3.2
4 24.1 31.8
4 83.2 67.4
I was able to match each row with its partner, like:
ID x y ID x y
1 2.5 3.5 1 85.1 74.1
2 2.6 3.4 2 86.0 69.8
3 25.8 32.9
4 24.1 31.8
However, as you can see, some of the new rows for ID 4 were placed wrong, because they just got added in the next few rows. I want to split them properly without the complex logic I am already using... Can someone give me an algorithm or an idea?
It should look like:
ID x y ID x y ID x y
1 2.5 3.5 1 85.1 74.1 3 25.8 32.9
2 2.6 3.4 2 86.0 69.8 4 24.1 31.8
4 2.8 3.2 3 84.4 68.2
4 83.2 67.4
It seems that your question is really about clustering, and that the ID column has nothing to do with determining which points correspond to which.
A common algorithm to achieve that would be k-means clustering. However, your question implies that you don't know the number of clusters in advance. This complicates matters, and a lot of questions have already been asked here on Stack Overflow regarding this issue:
Kmeans without knowing the number of clusters?
compute clustersize automatically for kmeans
How do I determine k when using k-means clustering?
How to optimal K in K - Means Algorithm
K-Means Algorithm
Unfortunately, there is no "right" solution for this. Two clusters in one specific problem could be indeed considered as one cluster in another problem. This is why you'll have to decide that for yourself.
Nevertheless, if you're looking for something simple (and probably inaccurate), you can use Euclidean distance as a measure. Compute the distances between points (e.g. using pdist), and group points where the distance falls below a certain threshold.
Example
%// Sample input
A = [1, 2.5, 3.5;
1, 85.1, 74.1;
2, 2.6, 3.4;
2, 86.0, 69.8;
3, 25.8, 32.9;
3, 84.4, 68.2;
4, 2.8, 3.2;
4, 24.1, 31.8;
4, 83.2, 67.4];
%// Cluster points
pairs = nchoosek(1:size(A, 1), 2); %// Rows of pairs
d = sqrt(sum((A(pairs(:, 1), 2:3) - A(pairs(:, 2), 2:3)) .^ 2, 2)); %// Distances on (x, y) only, i.e. pdist(A(:, 2:3)); the ID column is ignored
thr = d < 10; %// Distances below threshold
kk = 1;
idx = 1:size(A, 1);
C = cell(size(idx)); %// Preallocate memory
while any(idx)
    i = find(idx, 1);                      %// First unassigned point seeds a new cluster
    p = pairs(pairs(:, 1) == i & thr, :);  %// Close pairs that start at the seed
    x = unique([i; p(:)]);                 %// Cluster members (seed alone if none are close)
    C{kk} = A(x, :);
    idx(x) = 0;                            %// Remove assigned indices from the list
    kk = kk + 1;
end
C = C(~cellfun(@isempty, C)); %// Remove empty cells
The result is a cell array C, each cell representing a cluster:
C{1} =
1.0000 2.5000 3.5000
2.0000 2.6000 3.4000
4.0000 2.8000 3.2000
C{2} =
1.0000 85.1000 74.1000
2.0000 86.0000 69.8000
3.0000 84.4000 68.2000
4.0000 83.2000 67.4000
C{3} =
3.0000 25.8000 32.9000
4.0000 24.1000 31.8000
Note that this simple approach has the flaw of restricting the cluster radius to the threshold. However, you wanted a simple solution, so bear in mind that it gets complicated as you add more "clustering logic" to the algorithm.