Does anyone here use lp_solve to solve linear programming problems?
I have defined an integer linear programming problem in a file that contains the constraint x45=0 (there are other integer variables as well). After lp_solve solves the problem, the reported result is, strangely, x45=1. However, if I put a label before the constraint, for example c1: x45=0, then the constraint is met. Does anyone know what's going on?
The problem defined in my LP file looks like this:
max: 0 x0 262 x1 262 x2 262 x3 262 x4 262 x5 262 x6 262 x7 270 x8 0 x9 270 x10 270 x11 270 x12 270 x13 270 x14 270 x15;
549 x16 549 x17 0 x18 549 x19 549 x20 549 x21 549 x22 549 x23 >= 1;
603 x24 603 x25 603 x26 0 x27 603 x28 603 x29 603 x30 603 x31 >= 1;
x0=0;
x9=0;
x18=0;
x27=0;
x36=0;
x45=0;
x54=0;
x63=0;
x0=x0; x1=x8; x2=x16; x3=x24; x4=x32; x5=x40; x6=x48; x7=x56; x8=x1; x9=x9; x10=x17; x11=x25; x12=x33; x13=x41; x14=x49; x15=x57; x16=x2; x17=x10; x18=x18; x19=x26; x20=x34; x21=x42; x22=x50; x23=x58; x24=x3; x25=x11; x26=x19; x27=x27; x28=x35; x29=x43; x30=x51; x31=x59; x32=x4; x33=x12; x34=x20; x35=x28; x36=x36; x37=x44; x38=x52; x39=x60; x40=x5; x41=x13; x42=x21; x43=x29; x44=x37; x45=x45; x46=x53; x47=x61; x48=x6; x49=x14; x50=x22; x51=x30; x52=x38; x53=x46; x54=x54; x55=x62; x56=x7; x57=x15; x58=x23; x59=x31; x60=x39; x61=x47; x62=x55; x63=x63; x0 x1 x2 x3 x4 x5 x6 x7=1; x8 x9 x10 x11 x12 x13 x14 x15=1; x16 x17 x18 x19 x20 x21 x22 x23=1; x24 x25 x26 x27 x28 x29 x30 x31=1; x32 x33 x34 x35 x36 x37 x38 x39=1; x40 x41 x42 x43 x44 x45 x46 x47=1; x48 x49 x50 x51 x52 x53 x54 x55=1; x56 x57 x58 x59 x60 x61 x62 x63=1;
bin x0,x1,x2,x3,x4,x5,x6,x7,x8,x9,x10,x11,x12,x13,x14,x15,x16,x17,x18,x19,x20,x21,x22,x23,x24,x25,x26,x27,x28,x29,x30,x31,x32,x33,x34,x35,x36,x37,x38,x39,x40,x41,x42,x43,x44,x45,x46,x47,x48,x49,x50,x51,x52,x53,x54,x55,x56,x57,x58,x59,x60,x61,x62,x63;
Solving it gives the following result:
Value of objective function: 532.00000000
Actual values of the variables:
x0 0
x1 0
x2 0
x3 0
x4 0
x5 0
x6 0
x7 1
x8 0
x9 0
x10 1
x11 0
x12 0
x13 0
x14 0
x15 0
x16 0
x17 1
x18 0
x19 0
x20 0
x21 0
x22 0
x23 0
x24 0
x25 0
x26 0
x27 0
x28 1
x29 0
x30 0
x31 0
x36 0
x45 1
x54 1
x63 0
x32 0
x40 0
x48 0
x56 1
x33 0
x41 0
x49 0
x57 0
x34 0
x42 0
x50 0
x58 0
x35 1
x43 0
x51 0
x59 0
x37 0
x44 0
x38 0
x52 0
x39 0
x60 0
x46 0
x53 0
x47 0
x61 0
x55 0
x62 0
As you can see, x45 and x54 are both 1, while all the other constraints are met. If I put a label before the constraint, such as:
c1: x45=0;
then it is met. I am not sure why this makes a difference.
Updated response based on OP's clarifications
Something interesting is going on. When I solve the exact same IP that you posted above, all the constraints are being met.
Value of objective function: 532.00000000
Actual values of the variables:
x0 0
x1 0
x2 0
x3 0
x45 0
x54 0
x63 0
x32 0
x40 0
x48 1
x56 0
x33 0
x41 0
x49 0
x57 1
The good news is that the objective function value is the same. Your problem is highly degenerate. Here's what you can try in order to find out what's going on.
Somehow lp_solve is not seeing your x45=0 constraint.
Diagnose using the -stat option
Here's how you can see what's going on:
Create two LP files:
The original LP (say original.lp)
Another file with the c1: label added (say namedconstraint.lp)
Now run the following from the command line:
lp_solve -stat original.lp
lp_solve -stat namedconstraint.lp
If you compare the two outputs, you will see what is going on.
In my case, when I run lp_solve -stat I get
Constraints: 74
Variables : 64
Integers : 64
Semi-cont : 0
SOS : 0
Non-zeros : 190 density=4.011824%
Then you can keep tweaking the original.lp file until you see why this is happening.
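To make the comparison of the two -stat summaries less error-prone, something like the following Python sketch could be used. It only assumes that the summary consists of "Label : value" lines like the ones quoted above; the function names are made up for this example, and the numbers for the labelled file are invented purely for illustration (they are not real lp_solve output).

```python
def parse_stat_summary(text):
    """Collect integer values from 'Label : value' lines of a -stat style summary."""
    summary = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        label, _, rest = line.partition(":")
        tokens = rest.split()
        if not tokens:
            continue
        try:
            summary[label.strip()] = int(tokens[0])
        except ValueError:
            continue  # not a plain integer count, skip it
    return summary


def diff_summaries(first, second):
    """Return {label: (value_in_first, value_in_second)} for labels that differ."""
    labels = sorted(set(first) | set(second))
    return {k: (first.get(k), second.get(k)) for k in labels if first.get(k) != second.get(k)}


# Summary quoted in the answer for original.lp.
ORIGINAL = """Constraints: 74
Variables : 64
Integers : 64
Semi-cont : 0
SOS : 0
Non-zeros : 190 density=4.011824%"""

# HYPOTHETICAL summary for namedconstraint.lp, invented for this example only.
NAMED = """Constraints: 75
Variables : 64
Integers : 64
Semi-cont : 0
SOS : 0
Non-zeros : 191"""

print(diff_summaries(parse_stat_summary(ORIGINAL), parse_stat_summary(NAMED)))
```

In practice the two strings would be replaced by the captured output of the two lp_solve -stat runs.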
Further things to try:
Based on your additional remarks below, lp_solve is not seeing the constraint, unless you give the constraint a name. Try this next:
1. Move that constraint to be the very first, or the very last constraint in your model. Does that change anything?
2. I suspect that there are some strange characters in the line (or constraint) just before the line where you typed x45=0. See if it helps to delete that line.
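One way to look for such strange characters is to scan the file for anything that is not printable ASCII. The following is a minimal Python sketch, not an lp_solve feature; the function names are hypothetical and the file name original.lp is just the one suggested above.

```python
def find_suspicious_chars(text):
    """Return (line, column, code point) for every character that is not
    printable ASCII or a tab. Lines and columns are 1-based."""
    hits = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        line = line.rstrip("\r")  # tolerate Windows line endings
        for col, ch in enumerate(line, start=1):
            if ch == "\t":
                continue
            if not 32 <= ord(ch) <= 126:
                hits.append((line_no, col, f"U+{ord(ch):04X}"))
    return hits


def scan_file(path):
    """Read an LP file and report suspicious characters in it."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_suspicious_chars(handle.read())


# Small in-memory example: a non-breaking space hidden after "x36=0;".
sample = "x36=0;\u00a0\nx45=0;\n"
for line_no, col, code in find_suspicious_chars(sample):
    print(f"line {line_no}, column {col}: {code}")
```

Calling scan_file("original.lp") would run the same check on the actual model file.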
Unfortunately, I am unable to replicate the problem, so I cannot debug it myself. Hence these suggestions.
Related
I have an LP in CPLEX LP format, in file LP.tmp
Maximize
obj: x0 + 2 x1 + 3 x2 + 4 x3 + 5 x4 + 7 x5
Subject To
c0: 1 x0 + 1 x1 + 1 x2 + 1 x3 + 1 x4 + 1 x5 + 1 x6 + 1 x7 + 1 x8 = 0
End
On this I call soplex -X -x -o0 -f0 -v3 LP.tmp
This is obviously unbounded, but calling SoPlex gives me the following answer (along with some other lines):
SoPlex status : problem is solved [optimal]
Solving time (sec) : 0.00
Iterations : 0
Objective value : 0.00000000e+00
Primal solution (name, value):
All other variables are zero (within 1.0e-16). Solution has 0 nonzero entries.
Background: Originally, I had objective 0, but box constraints, and I always got infeasible. So I reduced everything, until I arrived at the above.
What am I doing wrong?
All variables are non-negative by default in the lp file format, see https://www.ibm.com/support/knowledgecenter/SSSA5P_12.5.0/ilog.odms.cplex.help/CPLEX/FileFormats/topics/LP.html
Therefore, your constraint fixes all variables to 0. As soon as you change the coefficient of any of the variables but x5 to -1 or add a bounds section where you define it to be free, e.g., x1 free, SoPlex claims unboundedness and provides a valid primal ray.
This model is not unbounded. There are implicit bounds of 0 on all variables, so the only feasible and hence optimal solution is the one SoPlex returns.
In the .lp data format, all variables are non-negative by default.
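Written out for the model above, with the default bounds made explicit, the argument is a one-liner:

```latex
x_i \ge 0 \;\; (i = 0,\dots,8), \qquad \sum_{i=0}^{8} x_i = 0
\;\Longrightarrow\; x_i = 0 \ \text{for all } i
\;\Longrightarrow\; x_0 + 2x_1 + 3x_2 + 4x_3 + 5x_4 + 7x_5 = 0 .
```

So the all-zero point is the only feasible point, and the objective value 0 reported by SoPlex is optimal.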
I'm new to proc optmodel and would appreciate any help to solve the problem at hand.
Here's my problem:
My dataset is like below:
data my_data;
input A B C;
cards;
0 240 3
3.4234 253 2
0 258 7
0 272 4
0 318 7
0 248 8
0 260 2
0.2555 305 5
0 314 5
1.7515 235 7
32 234 4
0 301 3
0 293 5
0 302 12
0 234 2
0 258 4
0 289 2
0 287 10
0 313 3
0.7725 240 7
0 268 3
1.4411 286 9
0 234 13
0.0474 318 2
0 315 4
0 292 5
0.4932 272 3
0 288 4
0 268 4
0 284 6
0 270 4
50.9188 293 3
0 272 3
0 284 2
0 307 3
;
run;
There are 3 variables (A, B, C), and I want to classify the observations into three classes (H, M, L) based on these 3 variables.
For class H, I want to maximize A and minimize B and C;
For class M, I want median values of A, B and C;
For class L, I want to minimize A and maximize B and C.
Also, the constraint is that fewer than 5% of all observations are classified into H, and fewer than 7% into M.
The final goal is to find the cut-offs of A, B and C for classifying observations into the three classes.
Since the three classes are equally weighted, I scaled the variables first and created a risk variable, where risk = A+(1-B)+(1-C);
Thanks in advance for any help.
my sas code:
proc stdize data=my_data out=my_data1 method=RANGE;
var A B C;
run;
data new;
set my_data1;
risk = A+(1-B)+(1-C);
run;
proc sort data=new out=range;
by risk;
run;
proc optmodel;
/* read data */
set CUTOFF;
/* str risk_level {CUTOFF}; */
num a {CUTOFF};
num b {CUTOFF};
num c {CUTOFF};
read data my_data1 into CUTOFF=[_n_] a=A b=B c=C;
impvar risk{p in CUTOFF} = a[p]+(1-b[p])+(1-c[p]);
var indh {CUTOFF} binary;
var indmh {CUTOFF} binary;
var indo {CUTOFF} binary;
con sum{p in CUTOFF} indh[p] le 10;
con sum{p in CUTOFF} indmh[p] le 6;
con sum{p in CUTOFF} indo[p] le 19;
con class{p in CUTOFF}:indh[p]+indmh[p]+indo[p] le 1;
max new = sum{p in CUTOFF}(10*indh[p]+4*indmh[p]+indo[p])*risk[p];
solve;
print a b c risk indh indmh indo new;
quit;
So now my problem is how to find the min risk value in each class. Thanks!
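For illustration only, here is a small Python sketch (not SAS, and not a PROC OPTMODEL solution) of the two computations described in the question: range scaling followed by risk = A+(1-B)+(1-C), and then the minimum risk per class. The class labels in the example are hypothetical; in the real workflow they would come from the solved indh/indmh/indo indicators.

```python
def range_scale(values):
    """Scale values to [0, 1] as (v - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]


def risk_scores(a, b, c):
    """risk = A + (1 - B) + (1 - C), computed on the range-scaled columns."""
    sa, sb, sc = range_scale(a), range_scale(b), range_scale(c)
    return [x + (1 - y) + (1 - z) for x, y, z in zip(sa, sb, sc)]


def min_risk_per_class(risks, labels):
    """Smallest risk among the observations assigned to each class.
    Observations with label None (unassigned) are ignored."""
    cutoffs = {}
    for risk, label in zip(risks, labels):
        if label is None:
            continue
        if label not in cutoffs or risk < cutoffs[label]:
            cutoffs[label] = risk
    return cutoffs


# First five rows of the data set from the question.
A = [0, 3.4234, 0, 0, 0]
B = [240, 253, 258, 272, 318]
C = [3, 2, 7, 4, 7]
labels = ["M", "H", "L", "M", "L"]  # HYPOTHETICAL class assignment

print(min_risk_per_class(risk_scores(A, B, C), labels))
```

The per-class minimum is then a candidate cut-off on the risk scale: every observation in that class has a risk at or above it.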
I have a training dataframe dfTrain and the output of dfTrain.head() is shown below:
   C0  C1                              C2  C3  C4  C5               C6
0   1  73                 Not in universe   0   0   0  Not in universe
1   2  58  Self-employed-not incorporated   4  34   0  Not in universe
2   3  18                 Not in universe   0   0   0      High school
3   4   9                 Not in universe   0   0   0  Not in universe
4   5  10                 Not in universe   0   0   0  Not in universe
There are 38 features in total, both categorical and numerical. Ignoring C1 and scaling the numerical features, I am trying to build a logistic regression model. Since the dataframe has categorical features, I am creating another dataframe that holds dummy variables.
X = pd.get_dummies(dfTrain)
X now has 160 features, which is many more than dfTrain has.
Then I pass X and y (where y is the target variable) to the logistic regression classifier:
modelLogistic = LogisticRegression(C=10**-2, class_weight = 'balanced')
modelLogistic.fit(X, y)
The reason for using class_weight = 'balanced' is that y has 17 classes and is highly imbalanced.
My question is: is my approach correct? Am I missing anything?
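As a side note on what class_weight = 'balanced' does: scikit-learn documents the weights as n_samples / (n_classes * count of each class). The pure-Python sketch below reproduces that formula on a toy target so the effect on rare classes is visible; the function name and the toy labels are made up for this example.

```python
from collections import Counter


def balanced_class_weights(y):
    """Weight per class: n_samples / (n_classes * class_count)."""
    counts = Counter(y)
    n_samples = len(y)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Toy target with one frequent and one rare class (illustrative only).
y_toy = ["a"] * 8 + ["b"] * 2
print(balanced_class_weights(y_toy))
```

The rarer a class is, the larger its weight, so errors on rare classes cost more during fitting.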
I am using CGAL QP package to solve the following quadratic problem:
I am using the following MPS file to define the problem (first_qp.mps):
NAME first_qp
ROWS
E c0
COLUMNS
x0 c0 1
x1 c0 1
x2 c0 1
x3 c0 1
x4 c0 1
x5 c0 1
x6 c0 1
x7 c0 1
x8 c0 1
RHS
rhs c0 1
BOUNDS
UP BND x0 0.2
UP BND x1 0.2
UP BND x2 0.2
UP BND x3 0.2
UP BND x4 0.2
UP BND x5 0.2
UP BND x6 0.2
UP BND x7 0.2
UP BND x8 0.2
QUADOBJ
x0 x0 39.07
x1 x0 25.54
x2 x0 27.29
x3 x0 28.56
x4 x0 24.38
x5 x0 10.23
x6 x0 11.12
x7 x0 15.26
x8 x0 25.17
x1 x1 38.82
x2 x1 18.11
x3 x1 20.67
x4 x1 17.20
x5 x1 8.10
x6 x1 12.41
x7 x1 9.82
x8 x1 14.69
x2 x2 39.97
x3 x2 26.82
x4 x2 22.55
x5 x2 12.81
x6 x2 10.90
x7 x2 16.17
x8 x2 26.42
x3 x3 29.00
x4 x3 24.61
x5 x3 10.37
x6 x3 10.65
x7 x3 14.93
x8 x3 23.61
x4 x4 49.71
x5 x4 7.04
x6 x4 6.20
x7 x4 17.41
x8 x4 25.87
x5 x5 12.47
x6 x5 8.21
x7 x5 7.53
x8 x5 9.73
x6 x6 19.02
x7 x6 7.47
x8 x6 7.87
x7 x7 16.04
x8 x7 14.95
x8 x8 28.90
ENDATA
Note that I am using QUADOBJ to define the D matrix. With QUADOBJ, only the entries of 2D on or below the diagonal must be specified; entries above the diagonal are deduced from symmetry. I then feed this file to the solver (first_qp_from_mps.cpp):
// example: read quadratic program in MPS format from file
// the QP below is the first quadratic program example in the user manual
#include <iostream>
#include <fstream>
#include <CGAL/basic.h>
#include <CGAL/QP_models.h>
#include <CGAL/QP_functions.h>
// choose exact integral type
#ifdef CGAL_USE_GMP
#include <CGAL/Gmpz.h>
typedef CGAL::Gmpz ET;
#else
#include <CGAL/MP_Float.h>
typedef CGAL::MP_Float ET;
#endif
// program and solution types
typedef CGAL::Quadratic_program_from_mps<int> Program;
typedef CGAL::Quadratic_program_solution<ET> Solution;
int main() {
std::ifstream in ("first_qp.mps");
Program qp(in); // read program from file
assert (qp.is_valid()); // we should have a valid mps file
// solve the program, using ET as the exact type
Solution s = CGAL::solve_quadratic_program(qp, ET());
// output solution
std::cout << s;
return 0;
}
The project compiles and the executable file runs and returns the solution vector (0 1 0 0 0 0 0 0 0) and the value of the objective function is 0. I know this is not correct. The solution vector does not satisfy the upper bound constraint. The objective function evaluated at this solution vector cannot be equal to 0.
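Both observations can be checked by hand. The following is a rough Python sketch (independent of CGAL) that assumes the objective is x^T D x with no linear or constant term, since the MPS file above defines none, and that the QUADOBJ entries are the lower triangle of 2D as described. The helper names are made up for this example.

```python
# Lower triangle of 2D, copied column by column from the QUADOBJ section of
# first_qp.mps: entry k of column j is the coefficient for (x[j+k], x[j]).
LOWER_2D = [
    [39.07, 25.54, 27.29, 28.56, 24.38, 10.23, 11.12, 15.26, 25.17],
    [38.82, 18.11, 20.67, 17.20, 8.10, 12.41, 9.82, 14.69],
    [39.97, 26.82, 22.55, 12.81, 10.90, 16.17, 26.42],
    [29.00, 24.61, 10.37, 10.65, 14.93, 23.61],
    [49.71, 7.04, 6.20, 17.41, 25.87],
    [12.47, 8.21, 7.53, 9.73],
    [19.02, 7.47, 7.87],
    [16.04, 14.95],
    [28.90],
]
UPPER_BOUND = 0.2  # from the BOUNDS section; lower bound assumed to be 0


def full_symmetric(lower_columns):
    """Expand the column-wise lower triangle into a full symmetric matrix."""
    n = len(lower_columns)
    matrix = [[0.0] * n for _ in range(n)]
    for j, column in enumerate(lower_columns):
        for k, value in enumerate(column):
            matrix[j + k][j] = value
            matrix[j][j + k] = value
    return matrix


def objective(two_d, x):
    """x^T D x, computed as 0.5 * x^T (2D) x."""
    n = len(x)
    return 0.5 * sum(x[i] * two_d[i][j] * x[j] for i in range(n) for j in range(n))


def bound_violations(x, upper=UPPER_BOUND):
    """Indices of variables outside [0, upper]."""
    return [i for i, value in enumerate(x) if value < 0 or value > upper]


two_d = full_symmetric(LOWER_2D)
reported = [0, 1, 0, 0, 0, 0, 0, 0, 0]  # solution vector returned by the solver
print(objective(two_d, reported))       # half of the x1/x1 entry of 2D
print(bound_violations(reported))       # variables that break the 0.2 bound
```

Under these assumptions the reported vector gives an objective of 38.82 / 2 = 19.41 rather than 0, and x1 violates its upper bound.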
Am I making a mistake in specifying the MPS file for my quadratic programming problem, or is there something I need to adjust in the way the solver searches for a solution? Could my problem be related to the exact type that CGAL uses?
For instance, I have tried changing <int> to <double> in the following line
typedef CGAL::Quadratic_program_from_mps<int> Program;
The program compiled, but when I ran the executable the solver returned that no solution was feasible. But I know there is a feasible solution - I have found one using the solver in Excel.
You should indeed use <double> instead of <int> in the Program type. But on top of that, ET should be typedef'd as CGAL::Gmpzf (exact floating point type), and not CGAL::Gmpz (exact integral type).
I want to construct a data frame in an Rcpp function, but when I get it, it doesn't really look like a data frame. I've tried pushing vectors etc. but it leads to the same thing. Consider:
RcppExport SEXP makeDataFrame(SEXP in) {
Rcpp::DataFrame dfin(in);
Rcpp::DataFrame dfout;
for (int i=0;i<dfin.length();i++) {
dfout.push_back(dfin(i));
}
return dfout;
}
in R:
> .Call("makeDataFrame",mtcars,"myPkg")
[[1]]
[1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4
[16] 10.4 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7
[31] 15.0 21.4
[[2]]
[1] 6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4
[[3]]
[1] 160.0 160.0 108.0 258.0 360.0 225.0 360.0 146.7 140.8 167.6 167.6 275.8
[13] 275.8 275.8 472.0 460.0 440.0 78.7 75.7 71.1 120.1 318.0 304.0 350.0
[25] 400.0 79.0 120.3 95.1 351.0 145.0 301.0 121.0
[[4]]
[1] 110 110 93 110 175 105 245 62 95 123 123 180 180 180 205 215 230 66 52
[20] 65 97 150 150 245 175 66 91 113 264 175 335 109
[[5]]
[1] 3.90 3.90 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 3.92 3.07 3.07 3.07 2.93
[16] 3.00 3.23 4.08 4.93 4.22 3.70 2.76 3.15 3.73 3.08 4.08 4.43 3.77 4.22 3.62
[31] 3.54 4.11
[[6]]
[1] 2.620 2.875 2.320 3.215 3.440 3.460 3.570 3.190 3.150 3.440 3.440 4.070
[13] 3.730 3.780 5.250 5.424 5.345 2.200 1.615 1.835 2.465 3.520 3.435 3.840
[25] 3.845 1.935 2.140 1.513 3.170 2.770 3.570 2.780
[[7]]
[1] 16.46 17.02 18.61 19.44 17.02 20.22 15.84 20.00 22.90 18.30 18.90 17.40
[13] 17.60 18.00 17.98 17.82 17.42 19.47 18.52 19.90 20.01 16.87 17.30 15.41
[25] 17.05 18.90 16.70 16.90 14.50 15.50 14.60 18.60
[[8]]
[1] 0 0 1 1 0 1 0 1 1 1 1 0 0 0 0 0 0 1 1 1 1 0 0 0 0 1 0 1 0 0 0 1
[[9]]
[1] 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 1 1 1 1 1 1 1
[[10]]
[1] 4 4 4 3 3 3 3 4 4 4 4 3 3 3 3 3 3 4 4 4 3 3 3 3 3 4 5 5 5 5 5 4
[[11]]
[1] 4 4 1 1 2 1 4 2 2 4 4 3 3 3 4 4 4 1 2 1 1 2 2 4 2 1 2 2 4 6 8 2
Briefly:
DataFrames are indeed just like lists with the added restriction of having to have a common length, so they are best constructed column by column.
The best way is often to look at our unit tests. Here, inst/unitTests/runit.DataFrame.R groups the tests for the DataFrame class.
You also found the .push_back() member function in Rcpp, which we added for convenience and analogy with the STL. We do warn that it is not recommended: because of differences in the way R objects are constructed, we essentially always need to make full copies, so .push_back is not very efficient.
Despite my answering here frequently, the rcpp-devel list is a better place for Rcpp questions.
It seems Rcpp can return a proper data.frame, provided you supply the names explicitly. I'm not sure how to adapt this to your example with arbitrary names.
mkdf <- '
Rcpp::DataFrame dfin(input);
Rcpp::DataFrame dfout;
for (int i=0;i<dfin.length();i++) {
dfout.push_back(dfin(i));
}
return Rcpp::DataFrame::create( Named("x")= dfout(1), Named("y") = dfout(2));
'
library(inline)
test <- cxxfunction( signature(input="data.frame"),
mkdf, plugin="Rcpp")
test(input=head(iris))
Using the information from @baptiste's answer, this is what finally gives a well-formed data frame:
RcppExport SEXP makeDataFrame(SEXP in) {
Rcpp::DataFrame dfin(in);
Rcpp::DataFrame dfout;
Rcpp::CharacterVector namevec;
std::string namestem = "Column Heading ";
for (int i=0;i<2;i++) {
dfout.push_back(dfin(i));
namevec.push_back(namestem+std::string(1,(char)(((int)'a') + i)));
}
dfout.attr("names") = namevec;
Rcpp::DataFrame x;
Rcpp::Language call("as.data.frame",dfout);
x = call.eval();
return x;
}
I think the point remains that this might be inefficient because of push_back (as suggested by @Dirk) and the second Language call evaluation. I looked through the Rcpp unit tests and haven't been able to come up with something better yet. Does anybody have any ideas?
Update:
Using @Dirk's suggestions (thanks!), this seems to be a simpler, more efficient solution:
RcppExport SEXP makeDataFrame(SEXP in) {
Rcpp::DataFrame dfin(in);
Rcpp::List myList(dfin.length());
Rcpp::CharacterVector namevec;
std::string namestem = "Column Heading ";
for (int i=0;i<dfin.length();i++) {
myList[i] = dfin(i); // adding vectors
namevec.push_back(namestem+std::string(1,(char)(((int)'a') + i))); // making up column names
}
myList.attr("names") = namevec;
Rcpp::DataFrame dfout(myList);
return dfout;
}
I concur with joran. The output of a C function called from within R is a list of all its arguments, both "in" and "out", so each "column" of the dataframe could be represented in the C function call as an argument. Once the result of the C function call is in R, all that remains to be done is to extract those list elements using list indexing and give them the appropriate names.