How to find which constraint is violated from pyomo's ipopt interface? - python-2.7

I am running an optimization problem using pyomo's ipopt solver. My problem is sort of complicated, and it is declared infeasible by IPOPT. I will not post the entire problem unless needed. But, one thing to note is, I am providing a warm start for the problem, which I thought would help prevent infeasibility from rearing its ugly head.
Here's the output from pyomo and ipopt when I set tee=True inside of the solver:
Ipopt 3.12.4:
******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
Ipopt is released as open source code under the Eclipse Public License (EPL).
For more information visit http://projects.coin-or.org/Ipopt
******************************************************************************
This is Ipopt version 3.12.4, running with linear solver mumps.
NOTE: Other linear solvers might be more efficient (see Ipopt documentation).
Number of nonzeros in equality constraint Jacobian...: 104
Number of nonzeros in inequality constraint Jacobian.: 0
Number of nonzeros in Lagrangian Hessian.............: 57
Total number of variables............................: 31
variables with only lower bounds: 0
variables with lower and upper bounds: 0
variables with only upper bounds: 0
Total number of equality constraints.................: 29
Total number of inequality constraints...............: 0
inequality constraints with only lower bounds: 0
inequality constraints with lower and upper bounds: 0
inequality constraints with only upper bounds: 0
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
0 0.0000000e+00 1.00e+01 1.00e+02 -1.0 0.00e+00 - 0.00e+00 0.00e+00 0
WARNING: Problem in step computation; switching to emergency mode.
1r 0.0000000e+00 1.00e+01 9.99e+02 1.0 0.00e+00 20.0 0.00e+00 0.00e+00R 1
WARNING: Problem in step computation; switching to emergency mode.
Restoration phase is called at point that is almost feasible,
with constraint violation 0.000000e+00. Abort.
Restoration phase in the restoration phase failed.
Number of Iterations....: 1
(scaled) (unscaled)
Objective...............: 0.0000000000000000e+00 0.0000000000000000e+00
Dual infeasibility......: 9.9999999999999986e+01 6.0938999999999976e+02
Constraint violation....: 1.0000000000000000e+01 1.0000000000000000e+01
Complementarity.........: 0.0000000000000000e+00 0.0000000000000000e+00
Overall NLP error.......: 9.9999999999999986e+01 6.0938999999999976e+02
Number of objective function evaluations = 2
Number of objective gradient evaluations = 2
Number of equality constraint evaluations = 2
Number of inequality constraint evaluations = 0
Number of equality constraint Jacobian evaluations = 2
Number of inequality constraint Jacobian evaluations = 0
Number of Lagrangian Hessian evaluations = 2
Total CPU secs in IPOPT (w/o function evaluations) = 0.008
Total CPU secs in NLP function evaluations = 0.000
EXIT: Restoration Failed!
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
model, tee=True)
4
/Library/<path to solvers.pyc> in solve(self, *args, **kwds)
616 result,
617 select=self._select_index,
--> 618 default_variable_value=self._default_variable_value)
619 result._smap_id = None
620 result.solution.clear()
/Library/Frameworks<path to>/PyomoModel.pyc in load_from(self, results, allow_consistent_values_for_fixed_vars, comparison_tolerance_for_fixed_vars, ignore_invalid_labels, id, delete_symbol_map, clear, default_variable_value, select, ignore_fixed_vars)
239 else:
240 raise ValueError("Cannot load a SolverResults object "
--> 241 "with bad status: %s" % str(results.solver.status))
242 if clear:
243 #
ValueError: Cannot load a SolverResults object with bad status: error
You can actually see from the log output above that there were only 2 constraint evaluations, from this line:
Number of equality constraint evaluations = 2
So it was declared infeasible pretty quickly, and I imagine it won't be difficult to figure out which constraint was violated.
How do I find out which constraint was violated? Or which constraint is making it infeasible?
Here is a different question, but one that is still informative about IPOPT: IPOPT options for reducing constraint violation after fewer iterations

Running Ipopt with option print_level set to 8
gives me output like
DenseVector "modified d_L scaled" with 1 elements:
modified d_L scaled[ 1]= 2.4999999750000001e+01
DenseVector "modified d_U scaled" with 0 elements:
...
DenseVector "curr_c" with 1 elements:
curr_c[ 1]= 7.1997853012817359e-08
DenseVector "curr_d" with 1 elements:
curr_d[ 1]= 2.4999999473733212e+01
DenseVector "curr_d - curr_s" with 1 elements:
curr_d - curr_s[ 1]=-2.8774855209690031e-07
curr_c are the activities of the equality constraints (seen as c(x) = 0 internally by Ipopt), and curr_d are the activities of the inequality constraints (seen as d_L <= d(x) <= d_U internally).
So the absolute values of curr_c are the violations of the equality constraints, and max(d_L - curr_d, curr_d - d_U, 0) gives the violations of the inequality constraints.
The last iterate, including the constraint activities, is also returned by Ipopt and may be passed back to Pyomo, so you can just compare these values with the left- and right-hand sides of your constraints.
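One way to act on that from the Pyomo side is to evaluate each constraint at the point you supplied (the warm start) and report any that fall outside their bounds. The sketch below is an illustration under assumptions, not part of the original answer: it assumes your Pyomo model object is called model, picks an arbitrary tolerance of 1e-6, and uses load_solutions=False (available in recent Pyomo versions) to avoid the ValueError about bad solver status shown in the traceback.
from pyomo.environ import SolverFactory, Constraint, value

# `model` is assumed to be your already-constructed Pyomo model, with the
# warm-start values loaded into its variables.
opt = SolverFactory('ipopt')
opt.options['print_level'] = 8            # verbose Ipopt output as shown above
# Keep the current variable values in the model even though the solve fails,
# instead of raising on the bad solver status.
results = opt.solve(model, tee=True, load_solutions=False)

# Evaluate every active constraint at the current variable values and report
# the ones that violate their bounds by more than the tolerance.
tol = 1e-6
for con in model.component_data_objects(Constraint, active=True):
    body = value(con.body)
    lb = value(con.lower) if con.lower is not None else None
    ub = value(con.upper) if con.upper is not None else None
    viol = max(lb - body if lb is not None else 0.0,
               body - ub if ub is not None else 0.0,
               0.0)
    if viol > tol:
        print("%s violated by %g" % (con.name, viol))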

Related

Trajectory Analysis (SAS): Incorrect number of start values

I am attempting a trajectory analysis in SAS (proc traj).
Following instructions found online, I first begin by testing quadratic models with increasing numbers of groups (i.e., order 2 2; order 2 2 2; order 2 2 2 2; order 2 2 2 2 2).
I determined that a three-group linear model is the best fit (order 1 1 1;).
I then wish to add time-stable covariates with the risk command. As found online, I did this by adding the start parameters provided in the log.
At this point, I receive a notice: "Incorrect number of start values. There should be 10 start values based on the model specifications."
I understand that it's possible to delete some of the 12 parameter estimates provided, but how do I select which ones to remove?
Thank you.
Code:
proc traj data=followupyes outplot=op outstat=os out=of outest=oe itdetail;
id youthid;
title3 'linear 3-gp model ';
var pronoun_allpar1-pronoun_allpar3;
indep time1-time3;
model logit;
ngroups 3;
order 1 1 1;
weight wgt_00;
start 0.031547 0.499724 1.969017 0.859566 -1.236747 0.007471
0.771878 0.495458 0.000000 0.000000 0.000000 0.000000;
risk P00_45_1;
run;
%trajplot (OP, OS, "linear 3-gp model ", "Traj of Pronoun Support", "Pron Support", "Time");
Because you are estimating a model with 3 linear trajectories, you will need 2 start values for each of your 3 groups.
See here for more info: https://www.andrew.cmu.edu/user/bjones/example.htm

"Unknown Label type" decision tree classifier with floats

I want to use a decision tree to predict the value of a float based on 6 features that are also float values. I realise that a decision tree may not be the best method, but I am comparing multiple methods to try and understand them better.
The error I am getting is "Unknown label type" on my y training data list. I have read that "DecisionTreeClassifier" accepts float values, and that typically the values are converted to float32 anyway. I am explicitly setting the values in my list to float32, yet there still seems to be a problem. Can anybody help?
Sample of my x training data (features_x_train):
[[ 2.49496743e-01 6.07936502e-01 -4.20752168e-01 -3.88045199e-02
-7.59323120e-01 -7.59323120e-01]
[ 4.07418489e-01 5.36915325e-02 2.95270741e-01 1.87122121e-01
9.89770174e-01 9.89770174e-01]]
Sample of my y training data (predict_y_train): [ -7.59323120e-01 9.89770174e-01]
Code...
df_train = wellbeing_df[feature_cols].sample(frac=0.9)
#Split columns into predictor and result
features_x_train = np.array(df_train[list(top_features_cols)].values).astype(np.float32)
predict_y_train = np.asarray(df_train['Happiness score'], dtype=np.float32)
#Setup decision tree
decision_tree = tree.DecisionTreeClassifier()
decision_tree = decision_tree.fit(features_x_train, predict_y_train)
#Train tree on 90% of available data
error:
ValueError Traceback (most recent call last)
<ipython-input-103-a44a03982bdb> in <module>()
19 #Setup decision tree
20 decision_tree = tree.DecisionTreeClassifier()
---> 21 decision_tree = decision_tree.fit(features_x_train, predict_y_train) #Train tree on 90% of available data
22
23 #Test on remaining 10%
C:\Users\User\Anaconda2\lib\site-packages\sklearn\tree\tree.pyc in fit(self, X, y, sample_weight, check_input, X_idx_sorted)
175
176 if is_classification:
--> 177 check_classification_targets(y)
178 y = np.copy(y)
179
C:\Users\User\Anaconda2\lib\site-packages\sklearn\utils\multiclass.pyc in check_classification_targets(y)
171 if y_type not in ['binary', 'multiclass', 'multiclass-multioutput',
172 'multilabel-indicator', 'multilabel-sequences']:
--> 173 raise ValueError("Unknown label type: %r" % y)
174
175
ValueError: Unknown label type: array([[ -7.59323120e-01],
[ 9.89770174e-01],
Also, if I change the list to string values, then the code runs.
Decision Tree Classifier is, well... a classifier. A classifier is an estimator of a function from some arbitrary space (usually R^d) into a finite space of values, called the label space. Consequently, Python (scikit-learn) expects you to pass something label-like: integers, strings, etc. Floats are not a typical encoding of a finite space; they are used for regression.
Thus, in short, you seem to be confusing classification and regression. How to distinguish?
If you have y as floats, but only a finite number of different values can occur, and all of them appear in the training set, then this is classification: just convert your values to strings or integers and you are good to go.
If you have y as floats, and these are actual real values, and you can have plenty of values, even ones not seen in the training set, and you expect your model to somehow "interpolate", then this is regression and you are supposed to use DecisionTreeRegressor instead.
use sklearn.tree.DecisionTreeRegressor()
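For example, a minimal sketch of the regressor applied to data shaped like the question's (the array values below are placeholders, not the asker's real data):
import numpy as np
from sklearn import tree

# Toy stand-ins for features_x_train / predict_y_train from the question.
features_x_train = np.array([[0.25, 0.61, -0.42, -0.04, -0.76, -0.76],
                             [0.41, 0.05, 0.30, 0.19, 0.99, 0.99]],
                            dtype=np.float32)
predict_y_train = np.array([-0.76, 0.99], dtype=np.float32)

# DecisionTreeRegressor accepts continuous float targets,
# unlike DecisionTreeClassifier.
decision_tree = tree.DecisionTreeRegressor()
decision_tree.fit(features_x_train, predict_y_train)
print(decision_tree.predict(features_x_train[:1]))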

How can we use clustering results in Weka?

I am using Weka for my internship, but I have little knowledge of data mining. Maybe someone knows how I can apply the following results to my data sets to get all the data by cluster? The method I use now is to compute the distances between my attribute values and the mean values of each cluster, and then classify each instance by the nearest cluster. But this method is too rough for me.
=== Run information ===
Scheme:weka.clusterers.EM -I 100 -N -1 -M 1.0E-6 -S 100
Relation: wcet_cluster6 - Copie-weka.filters.unsupervised.attribute.Remove-R1-3,5-weka.filters.unsupervised.attribute.Remove-R5-12
Instances: 467
Attributes: 4
max
alt
stmt
bb
Test mode:evaluate on training data
=== Model and evaluation on training set ===
EM
Number of clusters selected by cross validation: 6
Cluster
Attribute 0 1 2 3 4 5
(0.28) (0.11) (0.25) (0.16) (0.04) (0.17)
==================================================================
max
mean 9.0148 10.9112 11.2826 10.4329 11.2039 10.0546
std. dev. 1.8418 2.7775 3.0263 2.5743 2.2014 2.4614
alt
mean 0.0003 19.6467 0.4867 2.4565 44.191 8.0635
std. dev. 0.0175 5.7685 0.5034 1.3647 10.4761 3.3021
stmt
mean 0.7295 77.0348 3.2439 12.3971 140.9367 33.9686
std. dev. 1.0174 21.5897 2.3642 5.1584 34.8366 11.5868
bb
mean 0.4362 53.9947 1.4895 7.2547 114.7113 22.2687
std. dev. 0.5153 13.1614 0.9276 3.5122 28.0919 7.6968
Time taken to build model (full training data) : 4.24 seconds
=== Model and evaluation on training set ===
Clustered Instances
0 163 ( 35%)
1 50 ( 11%)
2 85 ( 18%)
3 73 ( 16%)
4 18 ( 4%)
5 78 ( 17%)
Log likelihood: -9.09081
Thanks for your help!!
I think no one can really answer this. Some tips off the top of my head.
You have used the EM clustering algorithm (see the animated GIF on the Wikipedia page). From Weka's documentation synopsis:
"EM assigns a probability distribution to each instance which
indicates the probability of it belonging to each of the clusters. "
Is this complex output really what you want?
It also selects a number of clusters for you (unless you constrain that number).
In Weka 3.7 you can use the unsupervised attribute filter "ClusterMembership" in the Preprocess dialog to replace your dataset with the result of the cluster assignments. You need to select one reference attribute, though; by default it selects the last one. This creates hard-to-interpret output.
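Weka itself is driven through the GUI or its Java API here, but as an illustration of the same idea in Python, scikit-learn's GaussianMixture (a stand-in for Weka's EM, not something the question used) exposes both the hard cluster assignments and the per-cluster probabilities:
import numpy as np
from sklearn.mixture import GaussianMixture

# Toy data standing in for the max/alt/stmt/bb attributes (not the real data).
X = np.random.rand(100, 4)

gm = GaussianMixture(n_components=6, random_state=0).fit(X)
hard_labels = gm.predict(X)        # one cluster index per instance
soft_labels = gm.predict_proba(X)  # probability of each cluster per instance

# Group instances by their hard cluster assignment.
for k in range(6):
    print("cluster %d: %d instances" % (k, (hard_labels == k).sum()))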

Pandas Interpolate Returning NaNs

I'm trying to do basic interpolation of position data at 60 Hz (~16 ms) intervals. When I try to use pandas 0.14 interpolation over the dataframe, it tells me I only have NaNs in my data set (not true). When I try to run it over individual series pulled from the dataframe, it returns the same series without the NaNs filled in. I've tried setting the indices to integers, using different methods, and fiddling with the axis and limit parameters of the interpolation function - no dice. What am I doing wrong?
df.head(5) :
x y ms
0 20.5815 14.1821 333.3333
1 NaN NaN 350
2 20.6112 14.2013 366.6667
3 NaN NaN 383.3333
4 20.5349 14.2232 400
df = df.set_index(df.ms) # set indices to milliseconds
When I try running
df.interpolate(method='values')
I get this error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-462-cb0f1f01eb84> in <module>()
12
13
---> 14 df.interpolate(method='values')
15
16
/Users/jsb/anaconda/lib/python2.7/site-packages/pandas/core/generic.pyc in interpolate(self, method, axis, limit, inplace, downcast, **kwargs)
2511
2512 if self._data.get_dtype_counts().get('object') == len(self.T):
-> 2513 raise TypeError("Cannot interpolate with all NaNs.")
2514
2515 # create/use the index
TypeError: Cannot interpolate with all NaNs.
I've also tried running over individual series, which only return what I put in:
temp = df.x
temp.interpolate(method='values')
333.333333 20.5815
350.000000 NaN
366.666667 20.6112
383.333333 NaN
400.000000 20.5349
Name: x, dtype: object
EDIT :
Props to Jeff for inspiring the solution.
Adding:
df[['x','y','ms']] = df[['x','y','ms']].astype(float)
before
df.interpolate(method='values')
interpolation did the trick.
Based on your edit (with props to Jeff for inspiring the solution):
Adding:
df = df.astype(float)
before
df.interpolate(method='values')
interpolation did the trick for me as well. Unless you're sub-selecting a column set, you don't need to specify the columns.
I'm not able to reproduce the error (see below for a copy/paste-able example). Can you make sure the data you show is actually representative of your data?
In [137]: from StringIO import StringIO
In [138]: df = pd.read_csv(StringIO(""" x y ms
...: 0 20.5815 14.1821 333.3333
...: 1 NaN NaN 350
...: 2 20.6112 14.2013 366.6667
...: 3 NaN NaN 383.3333
...: 4 20.5349 14.2232 400"""), delim_whitespace=True)
In [140]: df = df.set_index(df.ms)
In [142]: df.interpolate(method='values')
Out[142]:
x y ms
ms
333.3333 20.58150 14.18210 333.3333
350.0000 20.59635 14.19170 350.0000
366.6667 20.61120 14.20130 366.6667
383.3333 20.57305 14.21225 383.3333
400.0000 20.53490 14.22320 400.0000

Computation of Kullback-Leibler (KL) distance between text-documents using numpy

My goal is to compute the KL distance between the following text documents:
1)The boy is having a lad relationship
2)The boy is having a boy relationship
3)It is a lovely day in NY
I first vectorised the documents in order to easily apply numpy:
1)[1,1,1,1,1,1,1]
2)[1,2,1,1,1,2,1]
3)[1,1,1,1,1,1,1]
I then applied the following code for computing KL distance between the texts:
import numpy as np
import math
from math import log
v=[[1,1,1,1,1,1,1],[1,2,1,1,1,2,1],[1,1,1,1,1,1,1]]
c=v[0]
def kl(p, q):
    p = np.asarray(p, dtype=np.float)
    q = np.asarray(q, dtype=np.float)
    return np.sum(np.where(p != 0, (p - q) * np.log10(p / q), 0))

for x in v:
    KL = kl(x, c)
    print KL
Here is the result of the above code: [0.0, 0.602059991328, 0.0].
Texts 1 and 3 are completely different, but the distance between them is 0, while texts 1 and 2, which are highly related, have a distance of 0.602059991328. This isn't accurate.
Does anyone have an idea of what I'm not doing right with regard to KL? Many thanks for your suggestions.
Though I hate to add another answer, there are two points here. First, as Jaime pointed out in the comments, KL divergence (or distance - they are, according to the following documentation, the same) is designed to measure the difference between probability distributions. This basically means that what you pass to the function should be two array-likes, the elements of each of which sum to 1.
Second, scipy apparently does implement this, with a naming scheme more related to the field of information theory. The function is "entropy":
scipy.stats.entropy(pk, qk=None, base=None)
http://docs.scipy.org/doc/scipy-dev/reference/generated/scipy.stats.entropy.html
From the docs:
If qk is not None, then compute a relative entropy (also known as
Kullback-Leibler divergence or Kullback-Leibler distance) S = sum(pk *
log(pk / qk), axis=0).
A bonus of this function is that it will normalize the vectors you pass it if they do not sum to 1 (though this means you have to be careful with the arrays you pass, i.e., how they are constructed from data).
Hope this helps; at least a library provides it, so you don't have to code your own.
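For example, a minimal sketch using the count vectors from the question (entropy normalizes them into probability distributions internally):
import numpy as np
from scipy.stats import entropy

p = np.array([1, 1, 1, 1, 1, 1, 1], dtype=float)  # document 1
q = np.array([1, 2, 1, 1, 1, 2, 1], dtype=float)  # document 2

# Relative entropy / KL divergence of p from q; pass base=10 to match
# the log10 used in the question (the default is natural log).
print(entropy(p, q, base=10))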
After a bit of googling to undersand the KL concept, I think that your problem is due to the vectorization : you're comparing the number of appearance of different words. You should either link your column indice to one word, or use a dictionnary:
# The boy is having a lad relationship It lovely day in NY
1)[1 1 1 1 1 1 1 0 0 0 0 0]
2)[1 2 1 1 1 0 1 0 0 0 0 0]
3)[0 0 1 0 1 0 0 1 1 1 1 1]
Then you can use your kl function.
To automatically vectorize with a dictionary, see How to count the frequency of the elements in a list? (collections.Counter is exactly what you need). Then you can loop over the union of the keys of the dictionaries to compute the KL distance.
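A minimal sketch of that dictionary-based vectorization using collections.Counter (the variable names are mine, not from the linked question; the column order here is alphabetical rather than the order shown in the table above):
from collections import Counter

# The three documents from the question.
docs = ["The boy is having a lad relationship",
        "The boy is having a boy relationship",
        "It is a lovely day in NY"]

counts = [Counter(doc.lower().split()) for doc in docs]
vocab = sorted(set().union(*counts))          # union of all words

# One count vector per document, over the shared vocabulary
# (Counter returns 0 for words a document does not contain).
vectors = [[c[word] for word in vocab] for c in counts]
for vec in vectors:
    print(vec)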
A potential issue might be in your NumPy definition of KL. Read the Wikipedia page for the formula: http://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence
Note that you multiply (p-q) by the log result. In accordance with the KL formula, this should only be p:
return np.sum(np.where(p != 0,(p) * np.log10(p / q), 0))
That may help...
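Putting the points from the answers above together (normalize the count vectors into probability distributions, and multiply only by p), a minimal sketch of a corrected kl function; the convention 0 * log 0 = 0 is assumed:
import numpy as np

def kl(p, q):
    # Normalize raw count vectors into probability distributions.
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    # Sum p * log10(p / q) over the entries where p > 0 (0 * log 0 -> 0).
    mask = p > 0
    return np.sum(p[mask] * np.log10(p[mask] / q[mask]))

print(kl([1, 2, 1, 1, 1, 2, 1], [1, 1, 1, 1, 1, 1, 1]))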