I am working on a Column Generation algorithm using CPLEX to solve the Reduced Master Problem.
After adding the new variables to the RMP, I set their upper bounds to 0, solve the RMP again, and retrieve their reduced costs (to check whether the value I calculated matches the one provided by CPLEX).
In the first iterations, the reduced costs match.
However, after some iterations, I start getting different reduced costs.
When I run the CPLEX Interactive Optimizer, read the LP model (or MPS) and compare the duals of the constraints, I get some different values.
Does it make any sense?
I've tried using different methods for solving my LP. Also tried changing tolerances.
Problem stats
Objective sense : Minimize
Variables : 453308 [Fix: 8, Box: 453300]
Objective nonzeros : 6545
Linear constraints : 578166 [Less: 70814, Greater: 503886, Equal: 3466]
Nonzeros : 2710194
RHS nonzeros : 7986
Variables : Min LB: 0.0000000 Max UB: 74868.86
Objective nonzeros : Min : 0.01000000 Max : 10000.00
Linear constraints :
Nonzeros : Min : 0.004000000 Max : 396.8800
RHS nonzeros : Min : 0.01250000 Max : 74868.86
Displaying the solution quality, I get this information:
Max. unscaled (scaled) bound infeas. = 8.52651e-014 (3.33067e-015)
Max. unscaled (scaled) reduced-cost infeas. = 2.24935e-010 (5.62339e-011)
Max. unscaled (scaled) Ax-b resid. = 5.90461e-011 (3.69038e-012)
Max. unscaled (scaled) c-B'pi resid. = 2.6489e-011 (7.27596e-012)
Max. unscaled (scaled) |x| = 45433 (2839.56)
Max. unscaled (scaled) |slack| = 4970.49 (80.1926)
Max. unscaled (scaled) |pi| = 295000 (206312)
Max. unscaled (scaled) |red-cost| = 411845 (330962)
Condition number of scaled basis = 1.1e+008
As mentioned in the comment by Erwin, what you are experiencing is probably degeneracy.
In anything larger than a toy model, both the primal and the dual solution are often not unique.
If you fix a set of primal variables at their optimal values, and the solution was otherwise primal-dual optimal and is still stored in CPLEX, then reoptimizing after applying the fixes should take zero iterations and return the same solution. But if no solution is stored in CPLEX and you reoptimize from scratch, then CPLEX may return a different (but equally optimal) primal and/or dual solution.
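A tiny LP makes the dual non-uniqueness concrete: duplicate a binding constraint and the solver is free to split the dual value between the two copies, so which duals you see depends on which optimal basis the solver lands on. A minimal sketch using SciPy's linprog (standing in for CPLEX; the same effect occurs with any simplex-based solver):

```python
from scipy.optimize import linprog

# max x  s.t.  x <= 1 stated twice  ->  min -x with two identical rows.
# Any split y1 + y2 = -1 (with y1, y2 <= 0) is a valid dual solution,
# so the individual row duals reported are basis-dependent.
res = linprog(c=[-1], A_ub=[[1], [1]], b_ub=[1, 1],
              bounds=[(0, None)], method="highs")
y1, y2 = res.ineqlin.marginals
print(res.x, y1, y2)  # the split between y1 and y2 may vary; y1 + y2 = -1
```

The primal optimum x = 1 is unique here, yet the dual is not; in a degenerate model the same thing can happen to the row duals your pricing step relies on.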
Do you see iterations in the log?
To debug, write out the model before fixing and after, then diff the two files to make sure there isn't a modeling/programming mistake on your side.
You are also welcome to contact me at bo.jensen (at) dk (dot) ibm (dot) com and I will try to help you as I don't follow stack overflow closely.
My guess would be that when you are setting up the subproblem you fail to account for the reduced cost of variables out of basis at their upper bound. Those reduced costs are essentially the dual values of the upper bound constraint and hence must be taken into account when setting up the subproblem.
This sort of accidental omission typically happens when the generated variables are created with an upper bound.
If this really is your problem, then the easiest solution may be simply not to specify upper bounds for the new variables, which you can do if the upper bound is implied (e.g., from the new variable being part of a clique constraint).
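To illustrate the point above: in the small LP below, x1 is nonbasic at its upper bound, so a reduced cost computed only from the row duals is nonzero, and that value shows up as the dual (marginal) of the bound constraint. A hedged sketch using SciPy's linprog (the analogous quantities are available from CPLEX):

```python
import numpy as np
from scipy.optimize import linprog

# min -2*x1 - x2   s.t.  x1 + x2 <= 3,   0 <= x1 <= 2,   0 <= x2 <= 5
c = np.array([-2.0, -1.0])
A = np.array([[1.0, 1.0]])
res = linprog(c=c, A_ub=A, b_ub=[3], bounds=[(0, 2), (0, 5)],
              method="highs")
# Reduced costs from the row duals alone: rc = c - A^T y = [-1, 0].
rc = c - A.T @ res.ineqlin.marginals
# x1 sits at its upper bound; its reduced cost -1 equals the dual of
# that upper-bound constraint, which the pricing step must account for:
print(res.x, rc, res.upper.marginals)
```

If the subproblem ignores the -1 attached to x1's upper bound, the computed reduced costs will disagree with the solver's, which matches the symptom described in the question.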
Related
Detailed business problem:
I'm trying to solve a production scheduling business problem as below:
I have two plants producing FG A and FG B respectively.
Both products consume the same raw material x.
I need to create a 30-day production schedule based on the raw material availability.
FG A and B can be produced only if there is sufficient raw material available on the day.
After every 6 days of production the plant has to undergo maintenance, and production on that day will be zero.
The objective is to maximize margin given the day-level raw material availability, while adhering to the production constraint (i.e., a shutdown after every 6th production day).
I need to build a linear program to address the problem below:
Variable y: (binary)
Variable z: cumulative sum of y
When z > 6, then y = 0. I also need to reset the accumulation of z after this point.
Desired output:
How can I turn this statement into MILP constraints? Are there techniques for solving this kind of problem? Thank you.
I think you can model your maintenance differently. Just forbid any sequences of 7 ones for y. I.e.
y[t-6]+y[t-5]+y[t-4]+y[t-3]+y[t-2]+y[t-1]+y[t] <= 6 for t=1,..,T
This is easier than using your accumulator. Note that the beginning needs some attention: you can use historic data for this. I.e., at t=1, the values for t=0,-1,-2,.. are known.
Your accumulator approach is not inherently wrong. We often use it to model inventory. An inventory capacity is a restriction on how large the accumulated inventory can be.
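The rolling-sum constraint above is exactly equivalent to forbidding any run of 7 consecutive production days. A small pure-Python check of that equivalence, with no solver involved (the function names are illustrative):

```python
from itertools import product

def satisfies_rolling_limit(y, window=7, limit=6):
    """True iff every window of `window` consecutive days sums to <= limit."""
    return all(sum(y[t - window + 1:t + 1]) <= limit
               for t in range(window - 1, len(y)))

def max_run_of_ones(y):
    """Length of the longest run of consecutive production days."""
    best = run = 0
    for v in y:
        run = run + 1 if v else 0
        best = max(best, run)
    return best

# the two formulations agree on every binary schedule of length 10
assert all(satisfies_rolling_limit(list(y)) == (max_run_of_ones(y) <= 6)
           for y in product([0, 1], repeat=10))
```

Because the rolling-sum form is linear in y, each window gives one ordinary <= constraint in the MILP, with no accumulator variable or reset logic needed.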
Gurobi 9.0.0 // C++
I am trying to get the shadow price of variables without explicitly generating a constraint for them.
I am generating variables the following way:
GRBModel* milp_model;
milp_model->addVar(lb, ub, 0.0, type, name);
Now I would like to get the shadow price (dual) for these variables.
I found this article which says that for "a linear program with lower and upper bounds on a variable, i.e., l ≤ x ≤ u" [...] "Gurobi gives access to the reduced cost of x, which corresponds to sl+su".
To get the shadow price of a constraint, one would use the GRB functions according to the following answer (Python, but the same idea), using the Pi constraint attribute.
What would be the GRB function that returns the previously mentioned reduced cost of x / shadow price of a variable?
I tried gurobi_var.get(GRB_DoubleAttr_Pi), which works for gurobi_constr.get(GRB_DoubleAttr_Pi),
but for a variable it returns: Not right attribute. Error code = 10003.
Can anyone help me with this?
I suppose you are referring to the reduced costs of the variables. You can get them via the variable attribute RC, i.e., gurobi_var.get(GRB_DoubleAttr_RC), as explained here. You then need to figure out whether a given reduced cost corresponds to the variable's upper or lower bound, as discussed here; for a minimization problem, a variable nonbasic at its lower bound has a nonnegative reduced cost, and one at its upper bound has a nonpositive reduced cost.
Use case: selecting the "optimal threshold" for a logistic model built with statsmodels' Logit to predict, say, binary classes (or multinomial, but integer, classes).
To select the threshold for a (say, logistic) model in Python, is there something built in? For small data sets I remember optimizing the "threshold" by picking the threshold that maximizes the buckets of correctly predicted labels (true "0" and true "1"), best seen in the graph here:
http://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test
I also know intuitively that if I set alpha values, it should give me a "threshold" that I can use below. How should I compute the threshold given a reduced model whose variables are all significant at 95% confidence? Obviously setting the threshold > 0.5 -> "1" would be too naive, and since I am looking at 95% confidence this threshold should be "smaller", meaning p > 0.2 or something.
This would then mean something like a range of "critical values" where the label should be "1", and "0" otherwise.
What I want is something like this -:
test_scores = smf.Logit(y_train, x_train, missing='drop').fit()
threshold = 0.2
# test_scores.predict(x_train, transform=False) gives continuous class
# probabilities, so to turn them into labels I compare them against a
# threshold (or use x_test if I am testing the model)
y_predicted_train = np.array(test_scores.predict(x_train, transform=False) > threshold, dtype=float)
table = np.histogram2d(y_train, y_predicted_train, bins=2)[0]
# will do the same on the "test" data
# crude way of selecting an optimal threshold
from scipy.stats import ks_2samp
import numpy as np
ks_2samp(y_train, y_predicted_train)
(0.39963996399639962, 0.958989)
# must get < 95% here, and keep modifying the threshold as above until I
# fail to reject the Null at 95%
# where y_train holds the REAL values and y_predicted_train the predictions
# back on the TRAIN dataset; to get y_predicted_train as binary labels I
# already applied the threshold as above
Question :-
1. How can I select the threshold in an objective way, i.e., reduce the percentage of misclassified labels? Say I care more about missing a "1" (a false negative) than about mispredicting a "0" as a "1" (a false positive), and I want to reduce that first error. This I can get from the ROC curve. The roc_curve function assumes that I have already done the labelling for the y_predicted class, and I am just revalidating this over the test set (point me if my understanding is incorrect). I also think that using the confusion matrix alone will not solve the problem of picking the threshold.
2. Which brings me to: how should I consume the output of these inbuilt functions (roc_curve, confusion_matrix) when selecting the optimal threshold (first on the train sample, then fine-tuning it over the test and cross-validation samples)?
I also looked up the official documentation of K-S tests in scipy here-
http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.kstest.html#scipy.stats.kstest
Related -:
Statistics Tests (Kolmogorov and T-test) with Python and Rpy2
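One objective way to pick the threshold, in the spirit of question 1 above, is to sweep candidate thresholds and maximize a weighted Youden statistic J = TPR - w * FPR, where w encodes how costly a false positive is relative to a false negative. A minimal NumPy sketch (the function name and the weighting scheme are illustrative, not from any library):

```python
import numpy as np

def best_threshold(y_true, y_score, fp_weight=1.0):
    """Sweep every distinct score as a threshold and return the one
    maximizing J = TPR - fp_weight * FPR.  A larger fp_weight punishes
    predicting "1" for a true "0" more heavily."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    pos, neg = (y_true == 1), (y_true == 0)
    best_t, best_j = None, -np.inf
    for t in np.unique(y_score):
        pred = y_score >= t
        tpr = (pred & pos).sum() / max(pos.sum(), 1)
        fpr = (pred & neg).sum() / max(neg.sum(), 1)
        j = tpr - fp_weight * fpr
        if j > best_j:
            best_j, best_t = j, t
    return best_t
```

Here y_score would be the continuous probabilities from test_scores.predict(x_train); the chosen threshold should then be validated on the test sample rather than re-tuned there.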
I am trying to estimate a maximum likelihood model and it is running into convergence problems in Stata. The actual model is quite complicated, but it converges with no troubles in R when it is supplied with appropriate starting values. I however cannot seem to get Stata to accept the starting values I provide.
I have included a simple example below estimating the mean of a poisson distribution. This is not the actual model I am trying to estimate, but it demonstrates my problem. I set the trace variable, which allows you to see the parameters as Stata searches the likelihood surface.
Although I use init to set a starting value of 0.5, the first iteration still shows that Stata is trying a coefficient of 4.
Why is this? How can I force the estimation procedure to use my starting values?
Thanks!
set obs 1000  // need observations before generating data
generate y = rpoisson(4)
capture program drop mypoisson
program define mypoisson
    args lnf mu
    quietly replace `lnf' = $ML_y1*ln(`mu') - `mu' - lnfactorial($ML_y1)
end
ml model lf mypoisson (mean:y=)
ml init 0.5, copy
ml maximize, iterate(2) trace
Output:
Iteration 0:
Parameter vector:
mean:
_cons
r1 4
Added: Stata doesn't ignore the initial value. If you look at the output of the ml maximize command, the first line of the listing is titled
initial: log likelihood =
and following the equals sign is the value of the likelihood at the parameter values set in the init statement.
I don't know how the search(off) or search(norescale) solutions affect the subsequent likelihood calculations, so these solutions might still be worthwhile.
Original "solutions":
To force a start at your initial value, add the search(off) option to ml maximize:
ml maximize, iterate(2) trace search(off)
You can also force a use of the initial value with search(norescale). See Jeff Pitblado's post at http://www.stata.com/statalist/archive/2006-07/msg00499.html.
There are two parameters when using RBF kernels with Support Vector Machines: C and γ. It is not known beforehand which C and γ are best for a given problem; consequently, some kind of model selection (parameter search) must be done. The goal is to identify a good (C, γ) pair so that the classifier can accurately predict unknown (i.e., testing) data.
weka.classifiers.meta.GridSearch is a meta-classifier for tuning a pair of parameters. It seems, however, that it takes ages to finish (when the dataset is rather large). What would you suggest to do in order to bring down the time required to accomplish this task?
According to A User's Guide to Support Vector Machines:
C: the soft-margin constant. A smaller value of C allows the classifier to ignore points close to the boundary, which increases the margin.
γ > 0 is a parameter that controls the width of the Gaussian.
Hastie et al.'s SVMPath explores the entire regularization path for C and only requires about the same computational cost of training a single SVM model. From their paper:
Our R function SvmPath computes all 632 steps in the mixture example (n+ = n− = 100, radial kernel, γ = 1) in 1.44 (0.02) secs on a Pentium 4, 2 GHz Linux machine; the svm function (using the optimized code libsvm, from the R library e1071) takes 9.28 (0.06) seconds to compute the solution at 10 points along the path. Hence it takes our procedure about 50% more time to compute the entire path than it costs libsvm to compute a typical single solution.
They released a GPLed implementation of the algorithm in R that you can download from CRAN here.
Using SVMPath should allow you to find a good C value for any given γ quickly. You would still need separate training runs for different γ values, but this is much faster than doing a separate run for each (C, γ) pair.
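If a full Cartesian grid over (C, γ) is still too slow, a common trick is a coarse logarithmic grid followed by a finer grid around the best coarse point. A solver-agnostic sketch of the one-dimensional version (the function and its defaults are illustrative; in practice `score` would be your cross-validated accuracy for a fixed γ):

```python
import numpy as np

def coarse_to_fine(score, lo, hi, coarse_n=7, fine_n=7):
    """Two-stage 1-D log-scale search: a coarse pass over [lo, hi],
    then a finer pass around the best coarse point.
    `score` is a callable, higher is better."""
    grid = np.logspace(np.log10(lo), np.log10(hi), coarse_n)
    best = max(grid, key=score)
    # refine within one coarse step on either side of the best point
    step = (np.log10(hi) - np.log10(lo)) / (coarse_n - 1)
    fine = np.logspace(np.log10(best) - step, np.log10(best) + step, fine_n)
    return max(fine, key=score)
```

This evaluates coarse_n + fine_n models per γ instead of one model per fine grid point over the whole range, which is where most of the GridSearch time goes on large datasets.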