I am using IPOPT via Pyomo (the AMPL interface) to solve a simple problem and am trying to validate that the primal Lagrangian gradient is zero at the solution. I'm running the following script, in which I construct what I would expect to be the gradient of the Lagrangian with respect to primal variables.
import pyomo.environ as pyo
from pyomo.common.collections import ComponentMap
m = pyo.ConcreteModel()
m.ipopt_zL_out = pyo.Suffix(direction=pyo.Suffix.IMPORT)
m.ipopt_zU_out = pyo.Suffix(direction=pyo.Suffix.IMPORT)
m.ipopt_zL_in = pyo.Suffix(direction=pyo.Suffix.EXPORT)
m.ipopt_zU_in = pyo.Suffix(direction=pyo.Suffix.EXPORT)
m.dual = pyo.Suffix(direction=pyo.Suffix.IMPORT_EXPORT)
m.v1 = pyo.Var(initialize=-2.0)
m.v2 = pyo.Var(initialize=2.0)
m.v3 = pyo.Var(initialize=2.0)
m.v1.setlb(-10.0)
m.v2.setlb(1.5)
m.v1.setub(-1.0)
m.v2.setub(10.0)
m.eq_con = pyo.Constraint(expr=m.v1*m.v2*m.v3 - 2.0 == 0)
obj_factor = 1
m.obj = pyo.Objective(
expr=obj_factor*(m.v1**2 + m.v2**2 + m.v3**2),
sense=pyo.minimize,
)
solver = pyo.SolverFactory("ipopt")
solver.solve(m, tee=True)
grad_lag_map = ComponentMap()
grad_lag_map[m.v1] = (
(obj_factor*2*m.v1) + m.dual[m.eq_con]*m.v2*m.v3 +
m.ipopt_zL_out[m.v1] + m.ipopt_zU_out[m.v1]
)
grad_lag_map[m.v2] = (
(obj_factor*2*m.v2) + m.dual[m.eq_con]*m.v1*m.v3 +
m.ipopt_zL_out[m.v2] + m.ipopt_zU_out[m.v2]
)
grad_lag_map[m.v3] = (
(obj_factor*2*m.v3) + m.dual[m.eq_con]*m.v1*m.v2
)
for var, expr in grad_lag_map.items():
print(var.name, pyo.value(expr))
According to this, however, the gradient of the Lagrangian is not zero when constructed in this way. I can get the gradient of the Lagrangian to be zero by using the following lines to construct grad_lag_map
grad_lag_map[m.v1] = (
-(obj_factor*2*m.v1) + m.dual[m.eq_con]*m.v2*m.v3 +
m.ipopt_zL_out[m.v1] + m.ipopt_zU_out[m.v1]
)
grad_lag_map[m.v2] = (
-(obj_factor*2*m.v2) + m.dual[m.eq_con]*m.v1*m.v3 +
m.ipopt_zL_out[m.v2] + m.ipopt_zU_out[m.v2]
)
grad_lag_map[m.v3] = (
-(obj_factor*2*m.v3) + m.dual[m.eq_con]*m.v1*m.v2
)
With a minus sign in front of the objective gradient, the gradient of the Lagrangian is zero. This is surprising to me. I would not expect to see this factor of -1 for minimization problems. Can anybody confirm whether IPOPT constructs its Lagrangian with this -1 factor for minimization problems, or whether this is the artifact of some other convention I am unaware of?
This is the Gradient of the Lagrangian w.r.t. x computed in Ipopt (https://github.com/coin-or/Ipopt/blob/2b1a2f9a60fb3f8426b47edbe3b3520c7335d201/src/Algorithm/IpIpoptCalculatedQuantities.cpp#L2018-L2023):
tmp->Copy(*curr_grad_f());
tmp->AddTwoVectors(1., *curr_jac_cT_times_curr_y_c(), 1., *curr_jac_dT_times_curr_y_d(), 1.);
ip_nlp_->Px_L()->MultVector(-1., *z_L, 1., *tmp);
ip_nlp_->Px_U()->MultVector(1., *z_U, 1., *tmp);
This corresponds to an Ipopt-internal representation of an NLP, which has the form

min  f(x)                      dual vars:
s.t. c(x) = 0                  y_c
     d(x) - s = 0              y_d
     d_L <= s <= d_U           v_L, v_U
     x_L <= x <= x_U           z_L, z_U
The Lagrangian for Ipopt is then
f(x) + y_c c(x) + y_d (d(x) - s) + v_L (d_L-s) + v_U (s-d_U) + z_L (x_L-x) + z_U (x-x_U)
and the gradient w.r.t. x is thus
f'(x) + y_c c'(x) + y_d d'(x) - z_L + z_U
The NLP that is used by most Ipopt interfaces is

min  f(x)                      duals:
s.t. g_L <= g(x) <= g_U        lambda
     x_L <= x <= x_U           z_L, z_U
The Gradient of the Lagrangian would be
f'(x) + lambda g'(x) - z_L + z_U
In your code, you have the wrong sign on z_L.
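If it helps, here is a small diagnostic sketch (reusing the model, suffixes, and solve from the script above) that evaluates the stationarity residual under both sign conventions, so you can see directly which one the exported multiplier values satisfy. Whether the suffixes carry Ipopt's internal signs or the AMPL interface's convention is exactly the point in question, so treat this as a check rather than a definitive formula:

# Sketch: evaluate the residual two ways
#   ipopt_form:    grad_f + lambda*grad_g - z_L + z_U   (Ipopt's internal convention)
#   question_form: -grad_f + lambda*grad_g + z_L + z_U  (the combination the question found to vanish)
lam = m.dual[m.eq_con]
grad_f = {m.v1: 2*m.v1, m.v2: 2*m.v2, m.v3: 2*m.v3}
grad_g = {m.v1: m.v2*m.v3, m.v2: m.v1*m.v3, m.v3: m.v1*m.v2}
for v in (m.v1, m.v2, m.v3):
    zL = m.ipopt_zL_out.get(v, 0.0)   # v3 has no bounds, so default to 0
    zU = m.ipopt_zU_out.get(v, 0.0)
    ipopt_form = pyo.value(grad_f[v] + lam*grad_g[v]) - zL + zU
    question_form = pyo.value(-grad_f[v] + lam*grad_g[v]) + zL + zU
    print(v.name, ipopt_form, question_form)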
Related
I need to solve this inequality:
|x_n - 2/3| < 0.01, where x_n = (n^3 + n^2)^(1/3) - (n^3 - n^2)^(1/3)
import sympy as sp
from sympy.abc import x,n
epsilon = 0.01
a = 2/3
xn = (n**3 + n**2)**(1/3) - (n**3 - n**2)**(1/3)
I tried to solve it like this:
ans = sp.solve_univariate_inequality(sp.Abs(xn-a) < epsilon,n,relational=False)
but received:
NotImplementedError: The inequality, Abs((x**3 - x**2)**0.333333333333333 - (x**3 + x**2)**0.333333333333333 + 2/3) < 0.01, cannot be solved using solve_univariate_inequality.
And tried so, but it didn't work
ans = sp.solve_poly_inequality(sp.Poly(xn-2/3-0.01, n, domain='ZZ'), '==')
but received:
sympy.polys.polyerrors.PolynomialError: (n**3 + n**2)**0.333333333333333 contains an element of the set of generators.
Same:
ans = sp.solveset(sp.Abs(xn-2/3) < 0.01, n)
ConditionSet(n, Abs((n**3 - n**2)**0.333333333333333 - (n**3 + n**2)**0.333333333333333 + 0.666666666666667) < 0.01, Complexes)
How can this inequality be solved?
from sympy import *
from sympy.abc import x,n
a = S(2) / 3
epsilon = 0.01
xn = (n**3 + n**2)**(S(1)/3) - (n**3 - n**2)**(S(1)/3)
ineq = Abs(xn-a) < epsilon
Let's verify where ineq is satisfied with plot_implicit. Note that the left hand side of the inequality is valid only for n>=1.
plot_implicit(ineq, (n, 0, 5), adaptive=False)
So, the inequality is satisfied for values of n greater than 3.something.
We can use a numerical approach to find the root of this equation: Abs(xn-a) - epsilon.
from scipy.optimize import bisect
func = lambdify([n], Abs(xn-a) - epsilon)
bisect(func, 1, 5)
# out: 3.5833149415284424
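As a quick follow-up check (reusing xn, a, and epsilon from the snippet above), evaluating the left-hand side at the first few integers shows that the inequality starts to hold at n = 4, consistent with the bisection root near 3.58:

f = lambdify([n], Abs(xn - a))
print([k for k in range(1, 7) if f(k) < epsilon])
# prints [4, 5, 6]: the inequality holds from n = 4 onward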
Can lp_solve return a uniform solution? (Is there a flag or something that will force this kind of behavior?)
Say that I have this:
max: x + y + z + w;
x + y + z + w <= 100;
Results in:
Actual values of the variables:
x 100
y 0
z 0
w 0
However, I would like to have something like:
Actual values of the variables:
x 25
y 25
z 25
w 25
This is an oversimplified example, but the idea is that if the variables have the same factor in the objective function, then the result should ideally be more uniform, not everything assigned to one variable with the others getting what is left.
Is this possible to do? (I've tested other libs and some of them seem to do this by default like the solver on Excel or Gekko for Python).
EDIT:
For instance, Gekko already has this behavior without me specifying anything...
from gekko import GEKKO
m = GEKKO()
x1,x2,x3,x4 = [m.Var() for i in range(4)]
#upper bounds
x1.upper = 100
x2.upper = 100
x3.upper = 100
x4.upper = 100
# Constrain
m.Equation(x1 + x2 + x3 + x4 <= 100)
# Objective
m.Maximize(x1 + x2 + x3 + x4)
m.solve(disp=False)
print(x1.value, x2.value, x3.value, x4.value)
>> [24.999999909] [24.999999909] [24.999999909] [24.999999909]
You would need to explicitly model this (as another objective). A solver does nothing automatically: it just finds a solution that obeys the constraints and optimizes the objective function.
Also, note that many linear solvers will produce so-called basic solutions (corner points). So "all variables in the middle" does not come naturally at all.
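As a sketch of what modeling it explicitly can look like while staying linear (written with Gekko here only because it already appears below; the same auxiliary-variable trick works in lp_solve or any LP tool): add a variable z that is bounded above by every x_i and give it a small reward, so the solver lifts the smallest variable instead of parking everything on one.

from gekko import GEKKO
m = GEKKO(remote=False)
x = m.Array(m.Var, 4, lb=0, ub=100)
z = m.Var(lb=0)                      # z <= min(x_i) via the constraints below
m.Equation(sum(x) <= 100)
m.Equations([xi >= z for xi in x])
m.Maximize(sum(x) + 1e-3*z)          # tiny reward for raising the smallest variable
m.solve(disp=False)
print([xi.value[0] for xi in x])     # roughly [25, 25, 25, 25]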
The example in Gekko ended on [25,25,25,25] because of how the solver took a step towards the solution from an initial guess of [0,0,0,0] (default in Gekko). The problem is under-specified so there are an infinite number of feasible solutions. Changing the guess values gives a different solution.
from gekko import GEKKO
m = GEKKO()
x1,x2,x3,x4 = m.Array(m.Var,4,lb=0,ub=100)
x1.value=50 # change initial guess
m.Equation(x1 + x2 + x3 + x4 <= 100)
m.Maximize(x1 + x2 + x3 + x4)
m.solve(disp=False)
print(x1.value, x2.value, x3.value, x4.value)
Solution with guess values [50,0,0,0]
[3.1593723566] [32.280209196] [32.280209196] [32.280209196]
Here is one method with equality constraints m.Equations([x1==x2,x1==x3,x1==x4]) to modify the problem to guarantee a unique solution that can be used by any linear programming solver.
from gekko import GEKKO
m = GEKKO()
x1,x2,x3,x4 = m.Array(m.Var,4,lb=0,ub=100)
x1.value=50 # change initial guess
m.Equation(x1 + x2 + x3 + x4 <= 100)
m.Maximize(x1 + x2 + x3 + x4)
m.Equations([x1==x2,x1==x3,x1==x4])
m.solve(disp=False)
print(x1.value, x2.value, x3.value, x4.value)
This gives a solution:
[25.000000002] [25.000000002] [25.000000002] [25.000000002]
QP Solution
Switching to a QP solver allows a slight penalty for deviations but doesn't consume a degree of freedom.
from gekko import GEKKO
m = GEKKO()
x1,x2,x3,x4 = m.Array(m.Var,4,lb=0,ub=100)
x1.value=50 # change initial guess
m.Equation(x1 + x2 + x3 + x4 <= 100)
m.Maximize(x1 + x2 + x3 + x4)
penalty = 1e-5
m.Minimize(penalty*(x1-x2)**2)
m.Minimize(penalty*(x1-x3)**2)
m.Minimize(penalty*(x1-x4)**2)
m.solve(disp=False)
print(x1.value, x2.value, x3.value, x4.value)
Solution with QP penalty
[24.999998377] [25.000000544] [25.000000544] [25.000000544]
I used the findContours() method to extract a contour from the image, but I have no idea how to calculate the curvature from a set of contour points. Can somebody help me? Thank you very much!
While the theory behind Gombat's answer is correct, there are some errors in the code as well as in the formulae (the denominator t+n-x should be t+n-t). I have made several changes:
* use symmetric derivatives to get more precise locations of curvature maxima
* allow a step size for the derivative calculation (can be used to reduce noise from noisy contours)
* works with closed contours
Fixes:
* return infinity as curvature if the divisor is 0 (instead of returning 0)
* added the missing square in the denominator
* correct check for a 0 divisor
std::vector<double> getCurvature(std::vector<cv::Point> const& vecContourPoints, int step)
{
std::vector< double > vecCurvature( vecContourPoints.size() );
if (vecContourPoints.size() < step)
return vecCurvature;
auto frontToBack = vecContourPoints.front() - vecContourPoints.back();
bool isClosed = ((int)std::max(std::abs(frontToBack.x), std::abs(frontToBack.y))) <= 1;
cv::Point2f pplus, pminus;
cv::Point2f f1stDerivative, f2ndDerivative;
for (int i = 0; i < vecContourPoints.size(); i++ )
{
const cv::Point2f& pos = vecContourPoints[i];
int maxStep = step;
if (!isClosed)
{
maxStep = std::min(std::min(step, i), (int)vecContourPoints.size()-1-i);
if (maxStep == 0)
{
vecCurvature[i] = std::numeric_limits<double>::infinity();
continue;
}
}
int iminus = i-maxStep;
int iplus = i+maxStep;
pminus = vecContourPoints[iminus < 0 ? iminus + vecContourPoints.size() : iminus];
pplus = vecContourPoints[iplus >= (int)vecContourPoints.size() ? iplus - (int)vecContourPoints.size() : iplus];
f1stDerivative.x = (pplus.x - pminus.x) / (iplus-iminus);
f1stDerivative.y = (pplus.y - pminus.y) / (iplus-iminus);
f2ndDerivative.x = (pplus.x - 2*pos.x + pminus.x) / ((iplus-iminus)/2*(iplus-iminus)/2);
f2ndDerivative.y = (pplus.y - 2*pos.y + pminus.y) / ((iplus-iminus)/2*(iplus-iminus)/2);
double curvature2D;
double divisor = f1stDerivative.x*f1stDerivative.x + f1stDerivative.y*f1stDerivative.y;
if ( std::abs(divisor) > 10e-8 )
{
curvature2D = std::abs(f2ndDerivative.y*f1stDerivative.x - f2ndDerivative.x*f1stDerivative.y) /
pow(divisor, 3.0/2.0 ) ;
}
else
{
curvature2D = std::numeric_limits<double>::infinity();
}
vecCurvature[i] = curvature2D;
}
return vecCurvature;
}
For me curvature is:
kappa(t) = |x'(t)*y''(t) - y'(t)*x''(t)| / (x'(t)^2 + y'(t)^2)^(3/2)
where t is the position inside the contour and x(t) resp. y(t) return the related x resp. y value. See here.
So, according to my definition of curvature, one can implement it this way:
std::vector< float > vecCurvature( vecContourPoints.size() );
cv::Point2f posOld, posOlder;
cv::Point2f f1stDerivative, f2ndDerivative;
for (size_t i = 0; i < vecContourPoints.size(); i++ )
{
const cv::Point2f& pos = vecContourPoints[i];
if ( i == 0 ){ posOld = posOlder = pos; }
f1stDerivative.x = pos.x - posOld.x;
f1stDerivative.y = pos.y - posOld.y;
f2ndDerivative.x = - pos.x + 2.0f * posOld.x - posOlder.x;
f2ndDerivative.y = - pos.y + 2.0f * posOld.y - posOlder.y;
float curvature2D = 0.0f;
if ( std::abs(f2ndDerivative.x) > 10e-4 && std::abs(f2ndDerivative.y) > 10e-4 )
{
curvature2D = sqrt( std::abs(
pow( f2ndDerivative.y*f1stDerivative.x - f2ndDerivative.x*f1stDerivative.y, 2.0f ) /
pow( f2ndDerivative.x + f2ndDerivative.y, 3.0 ) ) );
}
vecCurvature[i] = curvature2D;
posOlder = posOld;
posOld = pos;
}
It works on non-closed point lists as well. For closed contours, you may want to change the boundary behavior (for the first iterations).
UPDATE:
Explanation for the derivatives:
A derivative of a continuous one-dimensional function f(t) is:
f'(t) = lim_{h->0} (f(t+h) - f(t)) / h
But we are in a discrete space and have two discrete functions f_x(t) and f_y(t), where the smallest step for t is one, so the first derivative can be approximated by:
f'(t) ~ f(t) - f(t-1)
The second derivative is the derivative of the first derivative:
f''(t) ~ f'(t) - f'(t-1)
Using the approximation of the first derivative, this yields:
f''(t) ~ f(t) - 2*f(t-1) + f(t-2)
There are other approximations for the derivatives; if you google it, you will find a lot.
Here's a python implementation mainly based on Philipp's C++ code. For those interested, more details on the derivation can be found in Chapter 10.4.2 of:
Klette & Rosenfeld, 2004: Digital Geometry
import numpy as np

def getCurvature(contour,stride=1):
    # contour: sequence of (x, y) points; stride: half-window used for the finite differences
    curvature=[]
    assert stride<len(contour),"stride must be shorter than length of contour"
    for i in range(len(contour)):
        before=i-stride+len(contour) if i-stride<0 else i-stride
        after=i+stride-len(contour) if i+stride>=len(contour) else i+stride
        f1x,f1y=(contour[after]-contour[before])/stride
        f2x,f2y=(contour[after]-2*contour[i]+contour[before])/stride**2
        denominator=(f1x**2+f1y**2)**3+1e-11
        curvature_at_i=np.sqrt(4*(f2y*f1x-f2x*f1y)**2/denominator) if denominator > 1e-12 else -1
        curvature.append(curvature_at_i)
    return curvature
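A minimal usage sketch (assuming OpenCV 4.x and a binary uint8 image called mask, both of which are placeholders here): the contour is reshaped to an (N, 2) float array so that contour[i] unpacks into x and y.

import cv2
import numpy as np

# mask: binary uint8 image (assumed to exist)
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
contour = contours[0].reshape(-1, 2).astype(float)   # (N, 1, 2) -> (N, 2)
curv = getCurvature(contour, stride=5)
print(np.argmax(curv))   # index of the sharpest point on the contour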
EDIT:
you can use convexityDefects from OpenCV, here's a link
a code example to find fingers based on their contour (variable res), source:
def calculateFingers(res,drawing): # -> finished bool, cnt: finger count
# convexity defect
hull = cv2.convexHull(res, returnPoints=False)
if len(hull) > 3:
defects = cv2.convexityDefects(res, hull)
if type(defects) != type(None): # avoid crashing. (BUG not found)
cnt = 0
for i in range(defects.shape[0]): # calculate the angle
s, e, f, d = defects[i][0]
start = tuple(res[s][0])
end = tuple(res[e][0])
far = tuple(res[f][0])
a = math.sqrt((end[0] - start[0]) ** 2 + (end[1] - start[1]) ** 2)
b = math.sqrt((far[0] - start[0]) ** 2 + (far[1] - start[1]) ** 2)
c = math.sqrt((end[0] - far[0]) ** 2 + (end[1] - far[1]) ** 2)
angle = math.acos((b ** 2 + c ** 2 - a ** 2) / (2 * b * c)) # cosine theorem
if angle <= math.pi / 2: # angle less than 90 degree, treat as fingers
cnt += 1
cv2.circle(drawing, far, 8, [211, 84, 0], -1)
return True, cnt
return False, 0
In my case, I used about the same function to estimate the bending of a board while extracting the contour.
OLD COMMENT:
I am currently working on about the same thing; great information in this post. I'll come back with a solution when I have it ready.
From Jonasson's answer: shouldn't there be a tuple on the right side too? I believe it won't unpack:
f1x,f1y=(contour[after]-contour[before])/stride
f2x,f2y=(contour[after]-2*contour[i]+contour[before])/stride**2
I'm using Newton's method, so I want to find the positions of all six roots of the sixth-order polynomial, basically the points where the function is zero.
I found the rough values on my graph with the code below, but I want to output the positions of all six roots. I'm thinking of using x as an array to input the values to find those positions, but I'm not sure. I'm using 1.0 for now to locate the rough values. Any suggestions from here?
def P(x):
return 924*x**6 - 2772*x**5 + 3150*x**4 - 1680*x**3 + 420*x**2 - 42*x + 1
def dPdx(x):
return 5544*x**5 - 13860*x**4 + 12600*x**3 - 5040*x**2 + 840*x - 42
accuracy = 1e-10
x = 1.0
xlast = float("inf")
while np.abs(x - xlast) > accuracy:
xlast = x
x = xlast - P(xlast)/dPdx(xlast)
print(x)
p_points = []
x_points = np.linspace(0, 1, 100)
y_points = np.zeros(len(x_points))
for i in range(len(x_points)):
y_points[i] = P(x_points[i])
p_points.append(P(x_points))
plt.plot(x_points,y_points)
plt.savefig("roots.png")
plt.show()
The traditional way is to use deflation to factor out the already found roots. If you want to avoid manipulations of the coefficient array, then you have to divide the roots out.
Having found z[1],...,z[k] as root approximations, form
g(x)=(x-z[1])*(x-z[2])*...*(x-z[k])
and apply Newton's method to h(x)=f(x)/g(x) with h'(x)=f'/g-fg'/g^2. In the Newton iteration this gives
xnext = x - f(x)/( f'(x) - f(x)*g'(x)/g(x) )
Fortunately the quotient g'/g has a simple form
g'(x)/g(x) = 1/(x-z[1])+1/(x-z[2])+...+1/(x-z[k])
So with a slight modification to the Newton step you can avoid finding the same root over again.
This all still keeps the iteration real. To get at the complex root, use a complex number to start the iteration.
Proof of concept: adding eps=1e-8j to g'(x)/g(x) allows the iteration to go complex without preventing real values. It solves the equivalent problem 0 = exp(-eps*x)*f(x)/g(x).
import numpy as np
import matplotlib.pyplot as plt
def P(x):
return 924*x**6 - 2772*x**5 + 3150*x**4 - 1680*x**3 + 420*x**2 - 42*x + 1
def dPdx(x):
return 5544*x**5 - 13860*x**4 + 12600*x**3 - 5040*x**2 + 840*x - 42
accuracy = 1e-10
roots = []
for k in range(6):
x = 1.0
xlast = float("inf")
x_points = np.linspace(0.0, 1.0, 200)
y_points = P(x_points)
for rt in roots:
y_points /= (x_points - rt)
y_points = np.array([ max(-1.0,min(1.0,np.real(y))) for y in y_points ])
plt.plot(x_points,y_points,x_points,0*y_points)
plt.show()
while np.abs(x - xlast) > accuracy:
xlast = x
corr = 1e-8j
for rt in roots:
corr += 1/(xlast-rt)
Px = P(xlast)
dPx = dPdx(xlast)
x = xlast - Px/(dPx - Px*corr)
print(x)
roots.append(x)
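As an independent cross-check of the roots found by the deflation loop (not part of the method itself), numpy can compute all six at once from the coefficient array via its companion-matrix solver:

import numpy as np

coeffs = [924, -2772, 3150, -1680, 420, -42, 1]   # highest degree first
print(np.sort(np.roots(coeffs)))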
I am trying to integrate a second order differential equation using scipy.integrate.odeint. My equation is as follows:
m*x[i]'' + x[i]' = K/N * sum(j=0 to N) of sin(x[j] - x[i])
which I have converted into two first order ODEs as follows. In the code below, yinit is an array of the initial values x(0) and x'(0). My question is: what should the values of x(0) and x'(0) be?
x'[i]=y[i]
y'[i]=(-y[i]+K/N*sum(j=0 to N)of sin(x[j]-x[i]))/m
from numpy import *
from scipy.integrate import odeint
N = 50
def f(theta, t):
global N
x, y = theta
m = 0.95
K = 1.0
fx = zeros(N, float)
for i in range(N):
s = 0.0
for j in range(i+1,N):
s = s + sin(x[j] - x[i])
fx[i] = (-y[i] + (K*s)/N)/m
return array([y, fx])
t = linspace(0, 10, 100, endpoint=False)
Uniformly generating random number
theta = random.uniform(-180, 180, N)
Integrating function f using odeint
yinit = array([x(0), x'(0)])
y = odeint(f, yinit, t)[:,0]
print (y)
You can choose whatever initial conditions you want.
In your case, you decided to use a random initial condition for x for all the oscillators. You can use a random initial condition for 'y' as well I guess, as I did below.
There were a few errors in the above code, mostly on how to unpack x,y from theta and how to repack them at the end (see concatenate below in the corrected code). See also the concatenate for yinit.
The rest are stylistic/minor changes.
from numpy import concatenate, linspace, random, mod, zeros, sin
from scipy.integrate import odeint
Nosc = 20
assert mod(Nosc, 2) == 0
def f(theta, _):
N = theta.size // 2  # integer division so the slices below get integer indices
x, y = theta[:N], theta[N:]
m = 0.95
K = 1.0
fx = zeros(N, float)
for i in range(N):
s = 0.0
for j in range(i + 1, N):
s = s + sin(x[j] - x[i])
fx[i] = (-y[i] + (K * s) / N) / m
return concatenate(([y, fx]))
t = linspace(0, 10, 50, endpoint=False)
theta = random.uniform(-180, 180, Nosc)
theta2 = random.uniform(-180, 180, Nosc) #added initial condition for the velocities of the oscillators
yinit = concatenate((theta, theta2))
res = odeint(f, yinit, t)
X = res[:, :Nosc].T
Y = res[:, Nosc:].T
To plot the time evolution of the system, you can use something like
import matplotlib.pylab as plt
fig, ax = plt.subplots()
for displacement in X:
ax.plot(t, displacement)
ax.set_xlabel('t')
ax.set_ylabel('x')
fig.show()
What are you modelling? At first the equation looked a bit like Kuramoto oscillators, but then I noticed you also have an x[i]'' term.
Notice how in your model, since there is no spring term in the equation (no term in x(t) on the LHS), the value of x converges to an arbitrary value.
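A small sketch to visualize this, reusing t and X from the code above: the mean displacement levels off at a value set by the initial conditions instead of decaying back to zero.

import matplotlib.pylab as plt

fig, ax = plt.subplots()
ax.plot(t, X.mean(axis=0))   # mean position over all oscillators
ax.set_xlabel('t')
ax.set_ylabel('mean x')
fig.show()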