Baron Error with Pyomo: NonLinearity Error in POW expression

I am trying to solve a nonlinear optimization problem in Pyomo with BARON.
Solving works fine with solvers such as ipopt, bonmin, and couenne.
With BARON I get the following error:
===========================================================================
BARON version 18.5.9. Built: WIN-64 Wed May 9 22:52:08 EDT 2018
BARON is a product of The Optimization Firm, LLC. http://www.minlp.com/
BARON: NonLinearity Error in POW expression
ERROR: Solver (baron) returned non-zero return code (9)
ERROR: See the solver log above for diagnostic information.
Traceback (most recent call last):
File "C:/Users/public.THREADRIPPER/Desktop/CGAM_SWP.py", line 602, in <module>
Results = opt.solve(comp, tee=True, keepfiles=True)
File "C:\Anaconda3\lib\site-packages\pyomo\opt\base\solvers.py", line 626, in solve
"Solver (%s) did not exit normally" % self.name)
pyutilib.common._exceptions.ApplicationError: Solver (baron) did not exit normally
Process finished with exit code 1
Any idea where the problem is?
Thanks!

I'm not sure how urgent the matter still is, but I experienced the same issue as you. Actually, in the BARON.py file there is a note regarding this error message.
Anyway, I experience the problem in a different way than noted in the BARON.py file. In my case I am using the pyomo.core.base.symbolic.differentiate() function to create first derivatives of functions raised to the power of (1/3), which then results in negative exponents that BARON apparently has issues with. If I manually replace the negative exponents with division, it seems to work.
Hope this helps you.
UPDATE: I have to update my comment, because it is not negative exponents per se that caused the error message of interest ("NonLinearity Error in POW expression", Solver (baron) returned non-zero return code (9)) but the way I had introduced them in the input file. The BARON manual has an example of the case I am referring to.
So in my case the error message was caused because I had constraints defined like this:
... * ( -241.5 + 0.5 * x12 ) ^ -1.0 * ( x1 - x6 ) ^ -1.0 * ( -483.0 + x12 + x1 - x6 ) ^ -1.0 * ...
which, according to the BARON manual, are interpreted by the solver differently from my intention. In my case an x^y situation was created with nested negative-exponent expressions, which BARON could probably solve if they were introduced differently. However, if I include parentheses around the exponents, to get the expressions I am aiming for, i.e.:
... * ( -241.5 + 0.5 * x12 ) ^ (-1.0) * ( x1 - x6 ) ^ (-1.0) * ( -483.0 + x12 + x1 - x6 ) ^ (-1.0) * ...
it works perfectly fine.
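For completeness, here is a minimal sketch of the same workaround applied on the Pyomo side, before the .bar file is ever written. The model and variable names are hypothetical stand-ins, not taken from the original problem; the point is only that writing the reciprocal as an explicit division avoids emitting a bare negative exponent:

from pyomo.environ import ConcreteModel, Var, Constraint

m = ConcreteModel()
m.x1 = Var(initialize=2.0)
m.x6 = Var(initialize=1.0)

# (m.x1 - m.x6)**(-1.0) can end up in the .bar file as
# "( x1 - x6 ) ^ -1.0", which BARON misparses; an explicit
# division sidesteps the POW expression entirely:
m.c = Constraint(expr=1.0 / (m.x1 - m.x6) <= 10.0)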

Related

Sympy derivative with a non-symbol

For a project I am working on I need the derivative of a function with respect to cos(theta), but when using SymPy v1.5.1 I get an error message stating that non-symbols cannot be used as a derivative variable. This was no problem up to SymPy v1.3, but later versions give this error.
>>> l=1
>>> theta = symbols('theta')
>>> eq=diff((cos(theta)**2-1)**l,cos(theta),l)
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/base/data/home/apps/s~sympy-live-hrd/20200105t193609.423659059328302322/sympy/sympy/core/function.py", line 2446, in diff
    return f.diff(*symbols, **kwargs)
  File "/base/data/home/apps/s~sympy-live-hrd/20200105t193609.423659059328302322/sympy/sympy/core/expr.py", line 3352, in diff
    return Derivative(self, *symbols, **assumptions)
  File "/base/data/home/apps/s~sympy-live-hrd/20200105t193609.423659059328302322/sympy/sympy/core/function.py", line 1343, in __new__
ValueError:
Can't calculate derivative wrt cos(theta).
According to the SymPy documentation (https://docs.sympy.org/latest/modules/core.html#sympy.core.function.Derivative) I may be able to solve this using:
>>> from sympy.abc import t
>>> F = Function('F')
>>> U = f(t)
>>> V = U.diff(t)
>>> direct = F(t, U, V).diff(U)
Unfortunately I can't get this to work with this equation in SymPy v1.5.1.
Suggestions/help are much appreciated.
"derivative of a function wrt cos(theta)"
Did this really work in earlier SymPy versions, i.e. were you able to differentiate w.r.t. cos(theta)? This should not work, as differentiation is defined w.r.t. a symbol. For example, Maple also gives an error:
diff( 1+cos(theta)^2, cos(theta) )
Error, invalid input: diff received cos(theta), which is not valid for its 2nd argument
It is strange that Mathematica does allow this, but I think this is not good behavior. Maybe that is why SymPy no longer allows it.
But you can do this in sympy
from sympy import *
theta,x = symbols('theta x')
eq = (cos(theta)**2-1)**2
result = diff( eq.subs(cos(theta),x) ,x)
result.subs(x,cos(theta))
Which gives
4*(cos(theta)**2 - 1)*cos(theta)
In Mathematica (which allows this)
D[(Cos[theta]^2 - 1)^2, Cos[theta]]
gives
4 Cos[theta] (-1 + Cos[theta]^2)
Perhaps SymPy over-corrected. If the expression has a single generator matching the function of interest, then the substitution-equivalent differentiation could take place. Cases which (probably) shouldn't be allowed are (x + 1).diff(cos(x)), sin(x).diff(cos(x)), etc., but (cos(x)**2 - 1).diff(cos(x)) should (probably) be OK. As @Nasser has indicated, a simple substitution/differentiation/backsubstitution will work; a helper along those lines is sketched below.
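A minimal sketch of that substitution pattern as a reusable helper (the function name diff_wrt_expr is made up for illustration):

from sympy import symbols, cos, diff, Dummy

def diff_wrt_expr(expr, wrt):
    # Substitute a dummy symbol for the sub-expression, differentiate
    # with respect to the dummy, then substitute back.
    u = Dummy('u')
    return diff(expr.subs(wrt, u), u).subs(u, wrt)

theta = symbols('theta')
print(diff_wrt_expr((cos(theta)**2 - 1)**2, cos(theta)))
# 4*(cos(theta)**2 - 1)*cos(theta)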

SymPy cannot solve the equation cos(x) = -1/cosh(x)

In Wolfram|Alpha, one can solve cos λ = -1/cosh λ:
λ = ± 1.87510406871196...
λ = ± 4.69409113297417...
λ = ± 7.85475743823761...
λ = ± 10.9955407348755...
Why does cos(x) = -1/cosh(x) not work in SymPy?
I tried this:
from sympy import *
x = symbols('x', real=True)
eq = cos(x) + 1 /cosh(x)
ans=solve(eq)
print(ans)
# NotImplementedError: multiple generators [cos(x), exp(x)]
# No algorithms are implemented to solve equation cos(x) + 1/(exp(x)/2 + exp(-x)/2)
Edit (2018/08/21): graphing-calculator plots of both forms of the equation:
https://www.numberempire.com/graphingcalculator.php?functions=cos(x)%2C-1%2Fcosh(x)&xmin=0&xmax=10&ymin=-1.5&ymax=1.5&var=x
https://www.numberempire.com/graphingcalculator.php?functions=cos(x)*cosh(x)%2C-1&xmin=-10&xmax=10&ymin=-1.5&ymax=1.5&var=x
"Sym" in SymPy stands for symbolic. Did WolframAlpha find a symbolic solution? No, it did not; because there isn't one. So, SymPy did not find one, either.
What you got from WolframAlpha is a numeric solution. To get those, there are other Python libraries, most notably SciPy.
However, SymPy can get you numeric solutions too, by calling mpmath under the hood. This is done with nsolve. It takes a second argument, initial point of the search for solution, and returns one solution.
>>> nsolve(eq, 0)
7.85475743823761
If you want more, try multiple starting points:
>>> {nsolve(eq, n) for n in range(-10, 10)}
{4.69409113297418, -1.87510406871196, 7.85475743823761, -7.85475743823761, 1.87510406871196, -10.9955407348755, -4.69409113297418}
Here I tried 20 starting points, some roots were repeated, hence the use of a set to eliminate the repetition.
There are infinitely many solutions; whatever tool is used, you'll only get several of them. But for large x, 1/cosh(x) is effectively 0, so the roots are approximately the roots of cos(x) = 0, which are pi/2 + pi*k for any integer k. These also make natural starting points, as sketched below.
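Building on that observation, a small sketch using pi/2 + pi*k as starting points (this should recover the values quoted from Wolfram|Alpha, though which root nsolve converges to from a given start is not guaranteed):

from sympy import symbols, cos, cosh, nsolve, pi

x = symbols('x', real=True)
eq = cos(x) + 1/cosh(x)

# Positive roots lie near the zeros of cos(x), i.e. pi/2 + pi*k,
# so those make natural starting points:
roots = sorted(nsolve(eq, pi/2 + pi*k) for k in range(4))
print(roots)
# approximately [1.87510406871196, 4.69409113297418,
#                7.85475743823761, 10.9955407348755]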

pywavelet signal reconstruction

I am trying to understand the concept of wavelets using the pywavelet library. My first step was to see how I could reconstruct a given input signal using the wavelet coefficients. Please see my code below:
db1 = pywt.Wavelet('db1')
cA6, cD6, cD5, cD4, cD3, cD2, cD1 = pywt.wavedec(data, db1, level=6)
cA6cD_approx = (pywt.upcoef('a', cA6, 'db1', take=n, level=6)
                + pywt.upcoef('d', cD1, 'db1', take=n, level=6)
                + pywt.upcoef('d', cD2, 'db1', take=n, level=6)
                + pywt.upcoef('d', cD3, 'db1', take=n, level=6)
                + pywt.upcoef('d', cD4, 'db1', take=n, level=6)
                + pywt.upcoef('d', cD5, 'db1', take=n, level=6)
                + pywt.upcoef('d', cD6, 'db1', take=n, level=6))
plt.figure(figsize=(28,10))
p1, =plt.plot(t, cA6cD_approx,'r')
p2, =plt.plot(t, data, 'b')
plt.xlabel('Day')
plt.ylabel('Number of units sold')
plt.legend([p2,p1], ["original signal", "cA6+cD* reconstructed"])
plt.show()
This yielded the following plot:
Now, when I used the waverec() method, the signal reconstruction was quite accurate. Please see plot below:
Can someone please explain the difference between the two reconstruction methods?
They are both inverse discrete wavelet transforms. upcoef is a direct reconstruction using the coefficients, while waverec is a multilevel 1D inverse discrete wavelet transform. They do pretty much the same thing, but waverec does it in a way that lets you line up your coefficients and be more efficient when developing.
I changed it a little bit, especially the setting for level. From the plot you will see that the two ways of reconstructing produce the same result.
import numpy as np
import pywt
import matplotlib.pyplot as plt
data = np.loadtxt('Mysample_test.txt')
n = len(data)
wl = pywt.Wavelet("db1")
coeff_all = pywt.wavedec(data, wl, level=6)
cA6, cD6,cD5, cD4, cD3, cD2, cD1= coeff_all
omp0 = pywt.upcoef('a',cA6,wl,level=6)[:n]
omp1 = pywt.upcoef('d',cD1,wl,level=1)[:n]
omp2 = pywt.upcoef('d',cD2,wl,level=2)[:n]
omp3 = pywt.upcoef('d',cD3,wl,level=3)[:n]
omp4 = pywt.upcoef('d',cD4,wl,level=4)[:n]
omp5 = pywt.upcoef('d',cD5,wl,level=5)[:n]
omp6 = pywt.upcoef('d',cD6,wl,level=6)[:n]
#cA6cD_approx = omp0 + omp1 + omp2 + omp3 + omp4+ omp5 + omp6
#plt.figure(figsize=(18,9))
recon = pywt.waverec(coeff_all, wavelet= wl)
p1, =plt.plot(omp0 + omp6 + omp5 + omp4 + omp3 + omp2 + omp1,'r')
p2, =plt.plot(data, 'b')
p3, =plt.plot(recon, 'y')
plt.xlabel('Day')
plt.ylabel('Number of units sold')
plt.legend([p3,p2,p1], ["waverec reconstructed","original signal", "cA6+cD* reconstructed"])
plt.show()
The function wavedec performs a tree decomposition, which means a filtering followed by a downsampling (of a factor 2 for a dyadic scheme).
Both functions waverec and upcoef can lead to reconstruction.
The first one, waverec, performs a direct tree reconstruction symmetrical to what is done by wavedec, which means an upsampling followed by a filtering. At each reconstruction level (6 in your case) a summation is also performed to yield a signal with more details to be used for the next reconstruction level.
The second function, upcoef, allows you to perform the independent reconstruction of a given subscale without considering the details contained in the other subscales. This is usually done by zero padding when rebuilding the signal. In other words, upcoef can be seen as an interpolation operator.
In your case, you used upcoef to interpolate all the wavelet subscales from their decimated x-grid to the original x-grid. You then performed the summation of all the interpolated signals (only containing a defined and limited quantity of details). Because Daubechies' wavelets are orthogonal, they lead to a perfect reconstruction and this way you can get your original signal back after reconstruction.
In short:
waverec => direct reconstruction => original signal
n times upcoef => interpolation followed by a global summation => original signal
Subscale interpolation is only useful when you want to visualise all the details on the same non-decimated x-grid. Such an interpolation brings nothing more, since the quantity of information contained in any subscale and in its interpolated version is the same.
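A quick numerical check of the equivalence described above, self-contained with synthetic data in place of the original file. Note that the detail at level i must be reconstructed with level=i, not level=6 for every band:

import numpy as np
import pywt

data = np.random.randn(256)
n = len(data)

coeffs = pywt.wavedec(data, 'db1', level=6)  # [cA6, cD6, cD5, ..., cD1]

# Interpolate each subscale back to the original grid and sum:
total = pywt.upcoef('a', coeffs[0], 'db1', level=6)[:n]
for i, cD in enumerate(coeffs[1:], start=1):
    total += pywt.upcoef('d', cD, 'db1', level=7 - i)[:n]

print(np.allclose(total, pywt.waverec(coeffs, 'db1')))  # True
print(np.allclose(total, data))                         # True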

Detecting multicollinear columns, or columns that are linear combinations, while modelling in Python: LinAlgError

I am fitting a logit model with 34 explanatory variables, and it keeps throwing the singular matrix error below:
Traceback (most recent call last):
File "<pyshell#1116>", line 1, in <module>
test_scores = smf.Logit(m['event'], train_cols,missing='drop').fit()
File "/usr/local/lib/python2.7/site-packages/statsmodels-0.5.0-py2.7-linux-i686.egg/statsmodels/discrete/discrete_model.py", line 1186, in fit
disp=disp, callback=callback, **kwargs)
File "/usr/local/lib/python2.7/site-packages/statsmodels-0.5.0-py2.7-linux-i686.egg/statsmodels/discrete/discrete_model.py", line 164, in fit
disp=disp, callback=callback, **kwargs)
File "/usr/local/lib/python2.7/site-packages/statsmodels-0.5.0-py2.7-linux-i686.egg/statsmodels/base/model.py", line 357, in fit
hess=hess)
File "/usr/local/lib/python2.7/site-packages/statsmodels-0.5.0-py2.7-linux-i686.egg/statsmodels/base/model.py", line 405, in _fit_mle_newton
newparams = oldparams - np.dot(np.linalg.inv(H),
File "/usr/local/lib/python2.7/site-packages/numpy/linalg/linalg.py", line 445, in inv
return wrap(solve(a, identity(a.shape[0], dtype=a.dtype)))
File "/usr/local/lib/python2.7/site-packages/numpy/linalg/linalg.py", line 328, in solve
raise LinAlgError, 'Singular matrix'
LinAlgError: Singular matrix
That was when I stumbled on this method to reduce the matrix to its independent columns:
import numpy as np
from numpy import linalg

def independent_columns(A, tol=0):  # e.g. tol=1e-05 to catch near-collinearity
    """
    Return an array composed of independent columns of A, plus their indices.
    Note the answer may not be unique; this function returns one of many
    possible answers.
    https://stackoverflow.com/q/13312498/190597 (user1812712)
    http://math.stackexchange.com/a/199132/1140 (Gerry Myerson)
    http://mail.scipy.org/pipermail/numpy-discussion/2008-November/038705.html
    (Anne Archibald)

    >>> A = np.array([(2, 4, 1, 3), (-1, -2, 1, 0), (0, 0, 2, 2), (3, 6, 2, 5)])

    Column 2 is twice column 1, and column 4 is column 1 plus column 3,
    so only the first and third columns are independent:

    >>> independent_columns(A)[0]
    array([[ 2,  1],
           [-1,  1],
           [ 0,  2],
           [ 3,  2]])
    """
    # Columns whose diagonal entry of R in the QR factorization is (near)
    # zero are treated as linearly dependent on earlier columns.
    Q, R = linalg.qr(A)
    independent = np.where(np.abs(R.diagonal()) > tol)[0]
    return A[:, independent], independent
A, independent_col_indexes = independent_columns(train_cols.as_matrix(columns=None))
# train_cols will not be converted back from a df to a matrix object, so doing this explicitly
A2 = pd.DataFrame(A, columns=train_cols.columns[independent_col_indexes])
test_scores = smf.Logit(m['event'], A2, missing='drop').fit()
I still get the LinAlgError, though I was hoping I would now have a reduced-rank matrix.
Also, np.linalg.matrix_rank(train_cols) returns 33 (before calling independent_columns the total number of x columns was 34, i.e. len(train_cols.ix[0]) = 34, meaning I don't have a full-rank matrix), while np.linalg.matrix_rank(A2) also returns 33 (meaning I have dropped a column). Yet I still see the LinAlgError when I run test_scores = smf.Logit(m['event'], A2, missing='drop').fit(). What am I missing?
Reference for the code above:
How to find degenerate rows/columns in a covariance matrix
I tried building the model forward by introducing one variable at a time, which doesn't give me the singular matrix error, but I would rather have a deterministic method that tells me what I am doing wrong and how to eliminate these columns.
Edit (updated following the suggestions by @user333700 below):
1. You are right, A2 doesn't have the reduced rank of 33: len(A2.ix[0]) = 34, meaning the possibly collinear columns are not dropped. Should I increase tol to get the rank of A2 (and its number of columns) down to 33? If I change tol to 1e-05 above, I do get len(A2.ix[0]) = 33, which suggests that tol > 0 (strictly) is needed.
After this I just did the same, test_scores = smf.Logit(m['event'], A2, missing='drop').fit(), without 'nm', to get convergence.
2. Errors after trying the 'nm' method. The strange thing is that if I take just 20,000 rows, I do get results. Since it is not showing a MemoryError but "Inverting hessian failed, no bse or cov_params available", I am assuming there are multiple nearly identical records. What would you say?
m = smf.Logit(data['event_custom'].ix[0:1000000] , train_cols.ix[0:1000000],missing='drop')
test_scores=m.fit(start_params=None,method='nm',maxiter=200,full_output=1)
Warning: Maximum number of iterations has been exceeded
Warning (from warnings module):
File "/usr/local/lib/python2.7/site-packages/statsmodels-0.5.0-py2.7-linux-i686.egg/statsmodels/base/model.py", line 374
warn(warndoc, Warning)
Warning: Inverting hessian failed, no bse or cov_params available
test_scores.summary()
Traceback (most recent call last):
File "<pyshell#17>", line 1, in <module>
test_scores.summary()
File "/usr/local/lib/python2.7/site-packages/statsmodels-0.5.0-py2.7-linux-i686.egg/statsmodels/discrete/discrete_model.py", line 2396, in summary
yname_list)
File "/usr/local/lib/python2.7/site-packages/statsmodels-0.5.0-py2.7-linux-i686.egg/statsmodels/discrete/discrete_model.py", line 2253, in summary
use_t=False)
File "/usr/local/lib/python2.7/site-packages/statsmodels-0.5.0-py2.7-linux-i686.egg/statsmodels/iolib/summary.py", line 826, in add_table_params
use_t=use_t)
File "/usr/local/lib/python2.7/site-packages/statsmodels-0.5.0-py2.7-linux-i686.egg/statsmodels/iolib/summary.py", line 447, in summary_params
std_err = results.bse
File "/usr/local/lib/python2.7/site-packages/statsmodels-0.5.0-py2.7-linux-i686.egg/statsmodels/tools/decorators.py", line 95, in __get__
_cachedval = self.fget(obj)
File "/usr/local/lib/python2.7/site-packages/statsmodels-0.5.0-py2.7-linux-i686.egg/statsmodels/base/model.py", line 1037, in bse
return np.sqrt(np.diag(self.cov_params()))
File "/usr/local/lib/python2.7/site-packages/statsmodels-0.5.0-py2.7-linux-i686.egg/statsmodels/base/model.py", line 1102, in cov_params
raise ValueError('need covariance of parameters for computing '
ValueError: need covariance of parameters for computing (unnormalized) covariances
Edit 2 (updated following the suggestions by @user333700 below):
Reiterating what I am trying to model: fewer than about 1% of total users "convert" (success outcomes), so I took a balanced sample of 35 (+ve) / 65 (-ve).
I suspect the model is not robust, though it converges. So I will use start_params set to the params from an earlier iteration on a different dataset.
This edit is about confirming that start_params can feed into the fit, as below:
A,independent_col_indexes=independent_columns(train_cols.as_matrix(columns=None))
A2=pd.DataFrame(A, columns=train_cols.columns[independent_col_indexes])
m = smf.Logit(data['event_custom'], A2,missing='drop')
# m = smf.Logit(data['event_custom'], train_cols, missing='drop')  # this didn't work; 'nm' ran but did not converge, so used the L1 (lasso) fit below
test_scores=m.fit_regularized(start_params=None, method='l1', maxiter='defined_by_method', full_output=1, disp=1, callback=None, alpha=0, \
trim_mode='auto', auto_trim_tol=0.01, size_trim_tol=0.0001, qc_tol=0.03)
a_good_looking_previous_result.params=test_scores.params #storing the parameters of pass1 to feed into pass2
test_scores.params
bidfloor_Quartile_modified_binned_0 0.305765
connectiontype_binned_0 -0.436798
day_custom_binned_Fri -0.040269
day_custom_binned_Mon 0.138599
day_custom_binned_Sat -0.319997
day_custom_binned_Sun -0.236507
day_custom_binned_Thu -0.058922
user_agent_device_family_binned_iPad -10.793270
user_agent_device_family_binned_iPhone -8.483099
user_agent_masterclass_binned_apple 9.038889
user_agent_masterclass_binned_generic -0.760297
user_agent_masterclass_binned_samsung -0.063522
log_height_width 0.593199
log_height_width_ScreenResolution -0.520836
productivity -1.495373
games 0.706340
entertainment -1.806886
IAB24 2.531467
IAB17 0.650327
IAB14 0.414031
utilities 9.968253
IAB1 1.850786
social_networking -2.814148
IAB3 -9.230780
music 0.019584
IAB9 -0.415559
C(time_day_modified)[(6, 12]]:C(country)[AUS] -0.103003
C(time_day_modified)[(0, 6]]:C(country)[HKG] 0.769272
C(time_day_modified)[(6, 12]]:C(country)[HKG] 0.406882
C(time_day_modified)[(0, 6]]:C(country)[IDN] 0.073306
C(time_day_modified)[(6, 12]]:C(country)[IDN] -0.207568
C(time_day_modified)[(0, 6]]:C(country)[IND] 0.033370
... more params here
Now, on a different dataset (pass 2, for indexing), I model the same way as below,
i.e. I read a new dataframe, do all the variable transformations, and then model via Logit as before.
m_pass2 = smf.Logit(data['event_custom'], A2_pass2,missing='drop')
test_scores_pass2=m_pass2.fit_regularized(start_params=a_good_looking_previous_result.params, method='l1', maxiter='defined_by_method', full_output=1, disp=1, callback=None, alpha=0, \
trim_mode='auto', auto_trim_tol=0.01, size_trim_tol=0.0001, qc_tol=0.03)
and, possibly keep iterating by picking up "start_params" from earlier passes.
Several points to this:
You need tol > 0 to detect near-perfect collinearity, which might also cause numerical problems in later calculations.
Check the number of columns of A2 to see whether a column has really been dropped.
Logit needs to do some nonlinear calculations with the exog, so even if the design matrix is not very close to perfect collinearity, the transformed variables for the log-likelihood, derivative, or Hessian calculations might still end up with numerical problems, like a singular Hessian.
(All these are floating point problems that arise when we work near floating point precision, 1e-15 to 1e-16. There are sometimes differences in the default thresholds for matrix_rank and similar linalg functions, which can mean that in edge cases one function identifies a matrix as singular and another one doesn't.)
The default optimization method for the discrete models including Logit is a simple Newton method, which is fast in reasonably nice cases, but can fail in cases that are badly conditioned. You could try one of the other optimizers which will be one of those in scipy.optimize, method='nm' is usually very robust but slow, method='bfgs' works well in many cases but also can run into convergence problems.
Nevertheless, even when one of the other optimization methods succeeds, it is still necessary to inspect the results. More often than not, a failure with one method means that the model or estimation problem might not be well defined.
A good way to check whether it is just a problem of bad starting values or a specification problem is to run method='nm' first and then run one of the more accurate methods like newton or bfgs using the nm estimate as starting values, and see whether it succeeds from good starting values. A sketch of that two-stage pattern follows.
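A minimal sketch of the suggested two-stage fit, using the standard statsmodels.api entry point; y and X are placeholders for the endog/exog used above, not names from the original code:

import statsmodels.api as sm

model = sm.Logit(y, X, missing='drop')

# Robust but slow first pass:
res_nm = model.fit(method='nm', maxiter=5000, disp=0)

# Refine with Newton from the nm estimate:
res = model.fit(start_params=res_nm.params, method='newton', disp=0)
print(res.summary())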

SAS: build linear regression model

I need to build a simple regression model of the productivity of various land units in SAS, but I am fairly new to it. I have the following parameters:
Prodbanana / ELEVATION / SLOPE / SOILTYPE
The productivity values of banana are in kg/ha; the elevation and slope parameters are already classified (6 classes each), as is SOILTYPE (6 classes).
The model should be: Productivity = x1*ELEVATION + x2*SLOPE + x3*SOILTYPE
I tried so far:
proc glm data=Work.Banana2;
class ELEVATION SLOPE SOILTYPE;
model Prodbanana = ELEVATION + SLOPE + SOILTYPE;
run;
But it returns the following error:
45 proc glm data=Work.Banana2;
46 class ELEVATION SLOPE SOILTYPE;
47 model Prodbanana = ELEVATION + SLOPE + SOILTYPE;
-
22
-----
202
NOTE: The previous statement has been deleted.
ERROR 22-322: Syntax error, expecting one of the following: a name, ;, (, *, -, /, #,
CHARACTER, CHAR, NUMERIC, |.
ERROR 202-322: The option or parameter is not recognized and will be ignored.
48 run;
Any suggestions?
cheers
The addition operator is unnecessary. You can find a quick guide to model specification syntax here; it works across SAS regression procedures.
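Concretely, effects in a SAS model statement are separated by spaces rather than joined with +, so a corrected sketch of the call above would be:

proc glm data=Work.Banana2;
  class ELEVATION SLOPE SOILTYPE;
  model Prodbanana = ELEVATION SLOPE SOILTYPE;
run;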