I want to solve an overdetermined system of the form Ax=b, where A is an (m x n) matrix (with m>n), b is a vector of length m, and x is the vector of unknowns. I also want to bound the solution with lb and ub.
Given the following program:
(QP) minimize transpose(x).D.x + transpose(c).x + c0 subject to Ax ⋛ b, l ≤ x ≤ u
I wonder how to calculate the matrix D and the vector c. Because the matrix D has to be symmetric, I have defined it as D=transpose(A).A, and c as c=-transpose(A).b. My question is: is this representation correct? If not, how should I define D and c?
"Solving" an overdetermined system Ax = b usually means computing a solution x that minimizes the Euclidean norm of the error, e(x) = ||Ax-b||. If you have additional linear constraints of the form l <= x <= u, then you indeed get a Quadratic Program:
min { 0.5*e(x)^2 } <=> min { 0.5*(Ax-b)'*(Ax-b) }
<=> min { 0.5*x'*A'*A*x - b'*A*x + 0.5*b'*b }
<=> min { 0.5*x'*A'*A*x - b'*A*x }   (the constant 0.5*b'*b does not change the minimizer)
subject to the linear constraints
l <= x <= u
So you can define the matrix D to be half of A'*A (A' means A transposed):
D = 1/2*A'*A
and vector c to satisfy
c' = -b'*A => c = -A'*b
So your approach is not quite correct, but it was close: with D = A'*A the matching linear term would be c = -2*A'*b, so either halve D or double c.
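As a quick sanity check of the derivation, here is a minimal sketch in plain Python. The 3x2 matrix A, the vector b and the test point x are made-up illustrative values, and the helper names are hypothetical. It builds D = 1/2*A'*A, c = -A'*b and c0 = 1/2*b'*b and compares x'*D*x + c'*x + c0 with 1/2*||Ax-b||^2:

```python
def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(M, N):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*N)] for row in M]

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def qp_data(A, b):
    """Return (D, c, c0) such that x'Dx + c'x + c0 == 0.5*||Ax - b||^2."""
    At = transpose(A)
    D = [[0.5 * v for v in row] for row in matmul(At, A)]  # D = 1/2 * A'A
    c = [-v for v in matvec(At, b)]                         # c = -A'b
    c0 = 0.5 * dot(b, b)                                    # constant term
    return D, c, c0

def qp_objective(D, c, c0, x):
    return dot(x, matvec(D, x)) + dot(c, x) + c0

def half_squared_residual(A, b, x):
    r = [ri - bi for ri, bi in zip(matvec(A, x), b)]
    return 0.5 * dot(r, r)

# Hypothetical overdetermined system (m = 3, n = 2) and an arbitrary test point.
A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
b = [1.0, 0.0, 2.0]
x = [0.5, -0.25]
D, c, c0 = qp_data(A, b)
print(qp_objective(D, c, c0, x), half_squared_residual(A, b, x))
```

Both printed values should agree for any x, which is what makes the QP equivalent to the least-squares problem. The bounds l <= x <= u are passed to the QP solver unchanged.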
Example: let
M = Matrix([[1,2],[3,4]]) # and
p = Poly(x**3 + x + 1) # then
p.subs(x,M).expand()
gives the error :
TypeError: cannot add <class 'sympy.matrices.immutable.ImmutableDenseMatrix'> and <class 'sympy.core.numbers.One'>
which is very plausible, since the first two terms become matrices but the last term (the constant term) is not a matrix but a scalar. To remedy this, I changed the polynomial to
p = Poly(x**3 + x + x**0) # then
but the same error persists. Do I have to type the expression by hand, replacing x by M? In this example the polynomial has only three terms, but in reality I encounter (multivariate polynomials with) dozens of terms.
So I think the question mainly revolves around the concept of a matrix polynomial:
P(A) = a_0*I + a_1*A + a_2*A^2 + ... + a_n*A^n
(where P is a polynomial, and A is a matrix)
I think this is saying that the free term is a number and cannot be added to the rest, which is a matrix; the addition operation is undefined between those two types.
TypeError: cannot add <class 'sympy.matrices.immutable.ImmutableDenseMatrix'> and <class 'sympy.core.numbers.One'>
However, this can be circumvented by defining a function that evaluates the matrix polynomial for a specific matrix. The difference here is that we're using matrix exponentiation, so we correctly compute the free term of the matrix polynomial a_0 * I where I=A^0 is the identity matrix of the required shape:
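from sympy import *
x = symbols('x')
M = Matrix([[1,2],[3,4]])
p = Poly(x**3 + x + 1)
def eval_poly_matrix(P,A):
    res = zeros(*A.shape)
    # all_coeffs() is ordered from highest to lowest degree, so reverse it
    for i, a_i in enumerate(P.all_coeffs()[::-1]):
        # A**0 is the identity, so the free term becomes a_0 * I
        res += a_i * (A**i)
    return res
eval_poly_matrix(p,M)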
Output:
Matrix([[39, 56], [84, 123]])
In this example the polynomial has only three terms but in reality I encounter (multivariate polynomials with) dozens of terms.
The function eval_poly_matrix above can be extended to work for multivariate polynomials by using the .monoms() method to extract monomials with nonzero coefficients, like so:
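from sympy import *
x,y = symbols('x y')
M = Matrix([[1,2],[3,4]])
p = poly( x**3 * y + x * y**2 + y )
def eval_poly_matrix(P,*M):
    res = zeros(*M[0].shape)
    # monoms() and coeffs() list the nonzero terms in the same order
    for m, coeff in zip(P.monoms(), P.coeffs()):
        # start each term from coefficient * identity
        term = coeff * eye(*M[0].shape)
        # multiply in each variable's matrix raised to its exponent
        for i, e in enumerate(m):
            term *= M[i]**e
        res += term
    return res
eval_poly_matrix(p,M,eye(M.rows))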
Note: Some sanity checks, edge cases handling and optimizations are possible:
The number of variables present in the polynomial relates to the number of matrices passed as parameters: the former should never be greater than the latter, and if it is lower, then some logic needs to be present to handle that. I've only handled the case when the two are equal.
All matrices need to be square as per the definition of the matrix polynomial
A discussion about a multivariate version of the Horner's rule features in the comments of this question. This might be useful to minimize the number of matrix multiplications.
Handle the fact that in a matrix polynomial x*y is different from y*x, because matrix multiplication is non-commutative. Apparently the poly functions in sympy do not support non-commutative variables, but you can define symbols with commutative=False, and there seems to be a way forward there.
About the 4th point above, there is support for Matrix expressions in SymPy, and that can help here:
from sympy import *
from sympy.matrices import MatrixSymbol
A = Matrix([[1,2],[3,4]])
B = Matrix([[2,3],[3,4]])
X = MatrixSymbol('X',2,2)
Y = MatrixSymbol('Y',2,2)
I = eye(X.rows)
p = X**2 * Y + Y * X ** 2 + X ** 3 - I
display(p)
p = p.subs({X: A, Y: B}).doit()
display(p)
Output:
For more developments on this feature follow #18555
I am decomposing a sparse SPD matrix A using Eigen. It will be either an LLt or an LDLt decomposition (Cholesky), so we can assume the matrix will be decomposed as A = P^-1 L D L^t P, where P is a permutation matrix, L is lower triangular and D is diagonal (possibly the identity). If I do
SolverClassName<SparseMatrix<double> > solver;
solver.compute(A);
To solve Lx=b then is it efficient to do the following?
solver.matrixL().TriangularView<Lower>().solve(b)
Similarly, to solve Px=b then is it efficient to do the following?
solver.permutationPinv()*b
I would like to do this in order to compute b^t A^-1 b efficiently and stably.
Have a look at how _solve_impl is implemented for SimplicialCholesky. Essentially, you can simply write:
Eigen::VectorXd x = solver.permutationP()*b; // P not Pinv!
solver.matrixL().solveInPlace(x); // matrixL is already a triangularView
// depending on LLt or LDLt use either:
double res_llt = x.squaredNorm();
double res_ldlt = x.dot(solver.vectorD().asDiagonal().inverse()*x);
Note that you need to multiply by P and not Pinv, since the inverse of
A = P^-1 L D L^t P is
P^-1 L^-t D^-1 L^-1 P
because the order of matrices reverses when taking the inverse of a product.
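To spell out why the snippet above yields b^t A^-1 b, here is the derivation as a short sketch. It assumes P is a permutation matrix, so that P^-1 = P^t, and writes x = L^-1 P b for the vector that x holds after solveInPlace:

```latex
\begin{aligned}
b^{t} A^{-1} b &= b^{t}\, P^{-1} L^{-t} D^{-1} L^{-1} P\, b \\
               &= (L^{-1} P b)^{t}\, D^{-1}\, (L^{-1} P b) \qquad \text{since } P^{-1} = P^{t} \\
               &= x^{t} D^{-1} x, \qquad x = L^{-1} P b .
\end{aligned}
```

For LLt, D is the identity, so the last line reduces to the squared norm of x. For LDLt it is the dot product of x with D^-1 x.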
I am trying to write code that reads 3 long int variables, a, b, c.
The code should find all integer pairs (x,y) such that ax+by = c, but the input values can be up to 2*10^9. I'm not sure how to do this efficiently. My algorithm is O(n^2), which is really bad for such large inputs. How can I do it better? Here's my code:
#include <iostream>
#include <vector>
using namespace std;

typedef long int lint;

struct point
{
    lint x, y;
};

int main()
{
    lint a, b, c;
    vector<point> points;
    cin >> c >> a >> b;
    // brute force: try every (x, y) with 0 <= x, y < c
    for(lint x = 0; x < c; x++)
        for(lint y = 0; y < c; y++)
        {
            point candidate;
            if(a*x + b*y == c)
            {
                candidate.x = x;
                candidate.y = y;
                points.push_back(candidate);
                break; // leaves only the inner loop
            }
        }
}
Seems like you can apply a tiny bit of really trivial math to solve for y for any given value of x. Starting from ax + by = c:
ax + by = c
by = c - ax
Assuming non-zero b, we then get:
y = (c - ax) / b
With that in hand, we can generate our values of x in the loop, plug each one into the equation above, compute the matching value of y, and check whether it's an integer. If so, add that (x, y) pair, and go on to the next value of x.
You could, of course, take the next step and figure out which values of x would result in the required y being an integer, but even without doing that we've moved from O(N^2) to O(N), which is likely to be plenty to get the task done in a much more reasonable time frame.
Of course, if b is 0, then the by term is zero, so we have ax = c, which we can then turn into x = c/a, so we then just need to check that x is an integer, and if so all pairs of that x with any candidate value of y will yield the correct c.
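The single-loop idea above can be sketched as follows. This is a minimal illustration, not the asker's program: the function name solutions_in_range and the parameter x_max are hypothetical, and unlike the original loops it does not restrict y to be non-negative.

```python
def solutions_in_range(a, b, c, x_max):
    """Return all integer pairs (x, y) with 0 <= x < x_max and a*x + b*y == c.

    Assumes b != 0; the b == 0 case needs the separate treatment described above.
    """
    if b == 0:
        raise ValueError("b must be non-zero")
    pairs = []
    for x in range(x_max):
        remainder = c - a * x
        # y = (c - a*x) / b is an integer exactly when b divides the remainder
        if remainder % b == 0:
            pairs.append((x, remainder // b))
    return pairs

print(solutions_in_range(3, 5, 22, 10))  # [(4, 2), (9, -1)]
```

Each candidate x costs one subtraction and one divisibility test, so the whole scan is linear in x_max.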
I am trying to write a linear program and need a variable z that equals the sign of x-c, where x is another variable, and c is a constant.
I considered z = (x-c)/|x-c|. Unfortunately, if x=c, then this creates division by 0.
I cannot use z=x-c, because I don't want to weight it by the magnitude of the difference between x and c.
Does anyone know of a good way to express z so that it is the sign of x-c?
Thank you for any help and suggestions!
You can't model z = sign(x-c) exactly with a linear program (because the constraints in an LP are restricted to linear combinations of variables).
However, if you are willing to convert your linear program into a mixed integer program, you can model sign with the following two constraints:
L*b <= x - c <= U*(1-b)
z = 1 - 2*b
Where b is a binary variable, and L and U are lower and upper bounds on the quantity x-c. If b = 0, we have 0 <= x - c <= U and z = 1. If b = 1, we have L <= x - c <= 0 and z = 1 - 2*1 = -1.
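To see how the two constraints behave, the following sketch enumerates both values of the binary variable b for a fixed x and reports which (b, z) pairs are feasible. The function name and the sample bounds L = -10, U = 10 are illustrative assumptions; no solver is involved.

```python
def feasible_assignments(x, c, L, U):
    """List the (b, z) pairs satisfying
        L*b <= x - c <= U*(1 - b),   z = 1 - 2*b
    for a fixed x. Assumes L <= x - c <= U with L < 0 < U."""
    d = x - c
    result = []
    for b in (0, 1):
        if L * b <= d <= U * (1 - b):
            result.append((b, 1 - 2 * b))
    return result

print(feasible_assignments(7, 5, -10, 10))  # x > c
print(feasible_assignments(1, 5, -10, 10))  # x < c
print(feasible_assignments(5, 5, -10, 10))  # x == c
```

For x > c only b = 0 (z = 1) is feasible, and for x < c only b = 1 (z = -1). At x = c both values of b are feasible, so this formulation lets the solver pick z = 1 or z = -1 there rather than forcing z = 0.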
You can use a solver like Gurobi to solve mixed integer programs.
For k » 1 this is a smooth approximation of the sign function:
sgn(x) ≈ tanh(k*x)
Also
sgn(x) ≈ x / sqrt(x^2 + ε^2)
when ε → 0
These two approximations do not have the division-by-0 issue, but now you must tune a parameter.
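A small numerical sketch of two smooth sign approximations follows. It assumes the forms tanh(k*x) and x/sqrt(x^2 + eps^2); the function names and the default values of k and eps are illustrative choices, not recommendations.

```python
import math

def sign_tanh(x, k=100.0):
    """Smooth sign approximation tanh(k*x); sharper for larger k."""
    return math.tanh(k * x)

def sign_sqrt(x, eps=1e-3):
    """Smooth sign approximation x / sqrt(x^2 + eps^2); sharper for smaller eps."""
    return x / math.sqrt(x * x + eps * eps)

for value in (-0.5, 0.0, 0.5):
    print(value, sign_tanh(value), sign_sqrt(value))
```

Both functions return 0 at x = 0 without any division by zero, and the tuning parameter controls how quickly they saturate towards -1 and 1 away from 0.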
In some languages (e.g. C++ / C) you can simply write something like this:
double sgn(double x)
{
return (x > 0.0) - (x < 0.0);
}
Anyway consider that many environments / languages already have a sign function, e.g.
Sign[x] in Mathematica
sign(x) in Matlab
Math.signum(x) in Java
sign(1, x) in Fortran
sign(x) in R
Pay close attention to what happens when x is equal to 0 (e.g. the Fortran function will return 1, with other languages you'll get 0).
Given a function y = f(A,x):
unsigned long F(unsigned long A, unsigned long x) {
    // widen to 64 bits so the product cannot overflow before the modulo
    return ((unsigned long long)A*x)%4294967295;
}
How would I find the inverse function x = g(A,y) such that x = g(A, f(A,x)) for all values of 'x'?
If f() isn't invertible for all values of 'x', what's the closest to an inverse?
(F is an obsolete PRNG, and I'm trying to understand how one inverts such a function).
Updated
If A is relatively prime to (2^N)-1, then g(A,y) is just f(A^-1, y).
If A isn't relatively prime, then the range of y is constrained...
Does g( ) still exist if restricted to that range?
You need the Extended Euclidean algorithm. This gives you R and S such that
gcd(A,2^N-1) = R * A + S * (2^N-1).
If the gcd is 1 then R is the multiplicative inverse of A. Then the inverse function is
g(A,y) = R*y mod (2^N-1).
Ok, here is an update for the case where the G = Gcd(A, 2^N-1) is not 1. In that case
R*y mod (2^N-1) = R*A*x mod (2^N-1) = G*x mod (2^N-1).
If y was computed by the function f then y is divisible by G. Hence we can divide the equation above by G and get an equation modulo (2^N-1)/G. Thus the set of solutions is
x = R*y/G + k*(2^N-1)/G, where k is an arbitrary integer.
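The recipe above can be sketched in Python as follows. This is an illustrative implementation with hypothetical function names; it returns every x in [0, m) that maps to y, which covers both the coprime case (one solution) and the case G > 1 (G solutions, or none if G does not divide y).

```python
def extended_gcd(a, b):
    """Return (g, r, s) with g = gcd(a, b) = r*a + s*b."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t

def solve_congruence(A, y, m):
    """Return all x in [0, m) with (A * x) % m == y, in increasing order."""
    G, R, _ = extended_gcd(A, m)
    if y % G != 0:
        return []  # y is not in the range of x -> A*x mod m
    step = m // G
    base = (R * (y // G)) % step  # solution modulo m/G
    return [base + k * step for k in range(G)]

M = 2**32 - 1
print(solve_congruence(12343, 1, M))  # the multiplicative inverse of 12343
```

When G is 1 the list has exactly one element, so solve_congruence(A, y, M)[0] plays the role of g(A, y). When G is larger, the list makes the ambiguity explicit instead of picking one preimage silently.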
The solution is given here (http://en.wikipedia.org/wiki/Linear_congruence_theorem), and includes a demonstration of how the extended Euclidean algorithm is used to find the solutions.
The modulus function in general does not have an inverse function, but you can sometimes find a set of x's that map to the given y.
Accipitridae, Glenn, and Jeff Moser have the answer between them, but it's worth explaining a little more why not every number has an inverse under mod 4294967295. The reason is that 4294967295 is not a prime number; it is the product of five factors: 3 x 5 x 17 x 257 x 65537. A number x has a multiplicative inverse under mod m if and only if x and m are coprime, so any number that is a multiple of any of those factors cannot have an inverse in your function.
This is why the modulus chosen for such PRNGs is usually prime.
You need to compute the inverse of A mod ((2^N) - 1), but you might not always have an inverse given your modulus. See this on Wolfram Alpha.
Note that
A = 12343 has an inverse (A^-1 = 876879007 mod 4294967295)
but 12345 does not have an inverse.
Thus, if A is relatively prime with (2^n)-1, then you can easily create an inverse function using the Extended Euclidean Algorithm where
g(A,y) = F(A^-1, y),
otherwise you're out of luck.
UPDATE: In response to your updated question, you still can't get a unique inverse in the restricted range. Even if you use CookieOfFortune's brute force solution, you'll have problems like
G(12345, F(12345, 4294967294)) == 286331152 != 4294967294.
Eh... here's one that will work:
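unsigned long G(unsigned long A, unsigned long y)
{
    // brute force: return the smallest x with F(A, x) == y
    for(unsigned int i = 0; i < 4294967295; i++)
    {
        if(y == F(A, i)) return i;
    }
    return 4294967295; // no x in [0, 4294967295) maps to y
}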