How to declare relations between SymPy symbols - sympy

I have the following code that's supposed to order a NumPy "matrix" based on the order of the elements of the first "row". I am dealing with SymPy variables, which do not have a straightforward ordering to them.
import sympy as sym
import numpy as np
a = sym.symbols("a", positive=True)
b = sym.symbols("b")
arr_num = np.array([[1.5, 3, 0], [.5, .4, .1]])
arr_sym_a = np.array([[a, 2*a, 0],[.5, .4, .1]])
arr_sym_b = np.array([[a, b, 0],[.5, .4, .1]])
def order(array):
    return array[:, np.argsort(array)][:, 0]
print(order(arr_num))
print(order(arr_sym_a))
print(order(arr_sym_b))
For arr_num, I get the expected output:
[[0. 1.5 3. ]
[0.1 0.5 0.4]]
As seen above, I already know how to declare a variable positive so that the np.argsort knows to order 0<a<2*a, and I do get the expected output for order(arr_sym_a) :
[[0 a 2*a]
[0.1 0.5 0.4]]
The question is whether there is a similar way to notify SymPy that b>a and then get
[[0 a b]
[0.1 0.5 0.4]]
So far I have been getting the error message "TypeError: cannot determine truth value of Relational", which is not surprising since there is no way for np.argsort to tell whether a > b or b > a.
Thanks

With symbols, the array is object dtype:
In [117]: arr_sym_a
Out[117]:
array([[a, 2*a, 0],
[0.5, 0.4, 0.1]], dtype=object)
In [119]: np.argsort(arr_sym_a)
Out[119]:
array([[2, 0, 1],
[2, 1, 0]])
So a row of the array can be sorted:
In [121]: np.sort(arr_sym_a[0])
Out[121]: array([0, a, 2*a], dtype=object)
Individual terms can be ordered:
In [122]: arr_sym_a[0,0]>arr_sym_a[0,1]
Out[122]: False
In [123]: arr_sym_a[0,1]>arr_sym_a[0,2]
Out[123]: True
In [124]: a>2*a
Out[124]: False
In [125]: 0>a
Out[125]: False
But:
In [126]: arr_sym_b[0,0]>arr_sym_b[0,1]
Out[126]: a > b
This is a sympy Relational object, which does not have a simple True/False value:
In [127]: bool(arr_sym_b[0,0]>arr_sym_b[0,1])
TypeError: cannot determine truth value of Relational
This is analogous to the array ambiguity error:
In [128]: bool(np.array([1,2])>0)
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Offhand I don't see anything in https://docs.sympy.org/latest/modules/core.html#module-sympy.core.relational
that helps, but there might be a way of declaring this relational to be True or False.
So while symbols can be used in NumPy arrays, the result is an object dtype array. Operations on such an array run at list-comprehension speed and depend heavily on which methods are defined on the individual elements.
sympy.lambdify is the best way to create a NumPy-compatible function from a SymPy expression, but even that is not foolproof.
The notes at the end of the Relational section of that page say more about this 'truth value' issue.
You can declare assumptions for symbols, but other than the positive assumption that you have used, I don't see anything that would declare an ordering between symbols:
https://docs.sympy.org/latest/modules/core.html#module-sympy.core.assumptions
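One possible workaround, sketched here under assumptions rather than taken from the SymPy docs as an official recipe: instead of making b an independent symbol, define it as a plus a positive gap. The helper symbol d is hypothetical and not part of the question. Because b - a is then known to be positive, the comparison b > a should evaluate to a definite truth value instead of staying an unevaluated Relational.

```python
import sympy as sym

a = sym.symbols("a", positive=True)
# Hypothetical helper symbol: the (positive) gap between a and b.
d = sym.symbols("d", positive=True)
b = a + d  # b is an expression, not a free symbol

# The difference b - a equals d, which is positive, so this is decidable.
print(b > a)

# Plain Python sorting relies on the same comparisons.
first_row = sorted([b, a, sym.Integer(0)])
print(first_row)
```

The trade-off is that results are printed in terms of a + d rather than b. Since every pairwise comparison among 0, a and a + d is decidable, np.argsort on an object array holding these entries would be expected to behave as it does for arr_sym_a.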

Related

Solving system of equations in sympy with matrix variables

I am looking for a matrix that solves a complicated system of equations; i.e., it would be hard to flatten the equations into vector form. Here is a toy example showing the error that I'm getting:
from sympy import nsolve, symbols, Inverse
from sympy.polys.polymatrix import PolyMatrix
import numpy as np
import itertools as itr
nnodes = 2
nodes = list(range(nnodes))
u_mat = PolyMatrix([symbols(f'u{i}{j}') for i, j in itr.product(nodes, nodes)]).reshape(2, 2)
u_mat_inv = Inverse(u_mat)
equations = [
u_mat_inv[0, 0] - 1,
u_mat_inv[0, 1] - 0,
u_mat_inv[1, 0] - 0,
u_mat_inv[1, 1] - 1
]
s = nsolve(equations, u_mat, np.ones(4))
This raises the following error:
TypeError: X must be a row or a column matrix
Is there a way around this without having to write the equations in vector form?
I think nsolve is getting confused because u_mat is a matrix. Passing list(u_mat) gives the input in the form nsolve expects. The next problem is that your initial guess is a singularity of the system of equations: a matrix of all ones is not invertible.
You can use normal solve here though:
In [24]: solve(equations, list(u_mat))
Out[24]: [(1, 0, 0, 1)]
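As a self-contained sketch of the same idea, the following builds the system with a plain Matrix instead of PolyMatrix (an assumption made only to keep the example short) and writes the inverse explicitly as adjugate divided by determinant. The names unknowns, u and u_inv are illustrative.

```python
from sympy import Matrix, eye, solve, symbols

# Illustrative names: the four unknown entries of a 2x2 matrix.
unknowns = list(symbols("u00 u01 u10 u11"))
u = Matrix(2, 2, unknowns)

# Explicit inverse: adjugate divided by determinant.
u_inv = u.adjugate() / u.det()

# One scalar equation per entry of (u_inv - I); each is implicitly "= 0".
equations = list(u_inv - eye(2))

# Pass the unknowns as a flat list rather than as a matrix.
solutions = solve(equations, unknowns, dict=True)
print(solutions)
```

The flat list of unknowns is the important part; the matrix is only used to build the equations.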

Solving constraint satisfaction problems in Sympy

I'm attempting to solve some simple Boolean satisfiability problems in Sympy. Here, I tried to solve a constraint that contains the Or logic operator:
from sympy import *
a,b = symbols("a b")
print(solve(Or(Eq(3, b*2), Eq(3, b*3))))
# In other words: (3 equals b*2) or (3 equals b*3)
# [1,3/2] was the answer that I expected
Surprisingly, this leads to an error instead:
TypeError: unsupported operand type(s) for -: 'Or' and 'int'
I can work around this problem using Piecewise, but this is much more verbose:
from sympy import *
a,b = symbols("a b")
print(solve(Piecewise((Eq(3, b*2),Eq(3, b*2)), (Eq(3, b*3),Eq(3, b*3)))))
#prints [1,3/2], as expected
Unfortunately, this work-around fails when I try to solve for two variables instead of one:
from sympy import *
a,b = symbols("a b")
print(solve([Eq(a,3+b),Piecewise((Eq(b,3),Eq(b,3)), (Eq(b,4),Eq(b,4)))]))
#AttributeError: 'BooleanTrue' object has no attribute 'n'
Is there a more reliable way to solve constraints like this one in Sympy?
To expand on zaq's answer, SymPy doesn't recognize logical operators in solve, but you can use the fact that
a*b = 0
is equivalent to
a = 0 OR b = 0
That is, multiply the two equations
solve((3 - 2*b)*(3 - 3*b), b)
As an additional note, if you wanted to use AND instead of OR, you can solve for a system. That is,
solve([eq1, eq2])
is equivalent to solving
eq1 = 0 AND eq2 = 0
Every equation can be expressed as something being equated to 0. For example, 3-2*b = 0 instead of 3 = 2*b. (In Sympy, you don't even have to write the =0 part, it's assumed.) Then you can simply multiply equations to express the OR logic:
>>> from sympy import *
>>> a,b = symbols("a b")
>>> solve((3-b*2)*(3-b*3))
[1, 3/2]
>>> solve([a-3-b, (3-b*2)*(3-b*3)])
[{b: 1, a: 4}, {b: 3/2, a: 9/2}]
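A different way to express the OR, shown here only as a sketch and not as the approach from the answers above: solve each alternative on its own and merge the results. The variable names alternatives and solutions are illustrative.

```python
from sympy import Eq, Rational, solve, symbols

b = symbols("b")

# Each entry is one branch of the OR.
alternatives = [Eq(3, b * 2), Eq(3, b * 3)]

# Solve every branch separately and collect the distinct solutions.
solutions = sorted({s for eq in alternatives for s in solve(eq, b)})
print(solutions)
```

This avoids building the product by hand, at the cost of one solve call per branch.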

Incorrect assignment of values in 2D list in python

I was trying to assign values to a multidimensional list in Python after first initializing it to zeroes.
Following is the code, where edge_strength and initialProb are multidimensional lists.
edge_strength = [[1,2,3],[3,4,5],[6,7,8]]
initialProb = [[0]*3]*3
initialColumn =1
denominator = 10
for r in range(0,3):
    initialProb[r][initialColumn] = float(edge_strength[r][initialColumn])/denominator
print(initialProb)
But when I finished and printed initialProb list, I got answer as - [[0, 0.7, 0], [0, 0.7, 0], [0, 0.7, 0]]
Instead it should've been - [[0, 0.2, 0], [0, 0.4, 0], [0, 0.7, 0]].
Can anyone explain this strange behaviour and suggest a workaround?
I don't understand why your solution does not work either. It seems as if edge_strength[r][initialColumn] is broadcast although initialProb[r][initialColumn] is a scalar. I would expect something similar if you instead wrote
initialProb[1] = float(edge_strength[r][initialColumn])/denominator
but like this it does not make sense.
Here is a workaround using numpy. numpy-arrays have the advantage, that multiple columns can be addressed at once. I hope that helps at least.
import numpy as np
initialProb = np.zeros((3,3))
edge_strength = np.array([[1,2,3],[3,4,5],[6,7,8]])
initialProb[:,initialColumn] = edge_strength[:,initialColumn].astype(float)/denominator
Edit: I understood what's going on. Refer to "Python list multiplication: [[...]]*3 makes 3 lists which mirror each other when modified". When you initialize initialProb you don't get three different rows, but three references to the same list. If you modify that list, all three rows change.
According to the thread
initialProb = [ [0]*3 for r in range(3) ]
should also solve your error
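A small sketch of the difference between the two initializations, using only built-in lists (the names aliased and independent are illustrative):

```python
# Three references to ONE inner list.
aliased = [[0] * 3] * 3
# Three separate inner lists.
independent = [[0] * 3 for _ in range(3)]

aliased[0][1] = 0.2
independent[0][1] = 0.2

print(aliased)      # the change shows up in every row
print(independent)  # only the first row changes
print(aliased[0] is aliased[1])  # the rows are the same object
```

This is why the last assignment in the loop (0.7) ended up in every row of initialProb.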

Method for evaluating the unit vector (or normalising a vector) in Python or in the numerical libraries: numpy, scipy

I would like to convert a NumPy array to a unit vector. More specifically, I am looking for an equivalent version of this normalisation function:
def normalize(v):
    norm = np.linalg.norm(v)
    if norm == 0:
        return v
    return v / norm
This function handles the situation where vector v has the norm value of 0.
Are there any similar functions provided in sklearn or numpy?
If you're using scikit-learn you can use sklearn.preprocessing.normalize:
import numpy as np
from sklearn.preprocessing import normalize
x = np.random.rand(1000)*10
norm1 = x / np.linalg.norm(x)
norm2 = normalize(x[:,np.newaxis], axis=0).ravel()
print(np.all(norm1 == norm2))
# True
I agree that it would be nice if such a function were part of the included libraries. But it isn't, as far as I know. So here is a version for arbitrary axes that gives optimal performance.
import numpy as np
def normalized(a, axis=-1, order=2):
    l2 = np.atleast_1d(np.linalg.norm(a, order, axis))
    l2[l2==0] = 1
    return a / np.expand_dims(l2, axis)
A = np.random.randn(3,3,3)
print(normalized(A,0))
print(normalized(A,1))
print(normalized(A,2))
print(normalized(np.arange(3)[:,None]))
print(normalized(np.arange(3)))
This might also work for you
import numpy as np
normalized_v = v / np.sqrt(np.sum(v**2))
but fails when v has length 0.
In that case, introducing a small constant to prevent the zero division solves this.
As proposed in the comments one could also use
v/np.linalg.norm(v)
To avoid zero division I use eps, but that's maybe not great.
def normalize(v):
    norm=np.linalg.norm(v)
    if norm==0:
        norm=np.finfo(v.dtype).eps
    return v/norm
If you have multidimensional data and want each axis normalized to its max or its sum:
def normalize(_d, to_sum=True, copy=True):
    # _d is an (n x dimension) NumPy array
    d = _d if not copy else np.copy(_d)
    d -= np.min(d, axis=0)
    d /= (np.sum(d, axis=0) if to_sum else np.ptp(d, axis=0))
    return d
Uses NumPy's peak-to-peak function, np.ptp.
a = np.random.random((5, 3))
b = normalize(a, copy=False)
b.sum(axis=0) # array([1., 1., 1.]), each column sums to 1
c = normalize(a, to_sum=False, copy=False)
c.max(axis=0) # array([1., 1., 1.]), the max of each column is 1
If you don't need utmost precision, your function can be reduced to:
v_norm = v / (np.linalg.norm(v) + 1e-16)
You mentioned scikit-learn, so I want to share another solution.
scikit-learn MinMaxScaler
In scikit-learn, there is an API called MinMaxScaler that lets you customize the value range as you like.
It also deals with NaN issues for us.
NaNs are treated as missing values: disregarded in fit, and maintained
in transform. ... see reference [1]
Code sample
The code is simple, just type
# Let's say X_train is your input dataframe
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
# call MinMaxScaler object
min_max_scaler = MinMaxScaler()
# feed in a numpy array
X_train_norm = min_max_scaler.fit_transform(X_train.values)
# wrap it up if you need a dataframe
df = pd.DataFrame(X_train_norm)
Reference
[1] sklearn.preprocessing.MinMaxScaler
There is also the function unit_vector() to normalize vectors in the popular transformations module by Christoph Gohlke:
import transformations as trafo
import numpy as np
data = np.array([[1.0, 1.0, 0.0],
[1.0, 1.0, 1.0],
[1.0, 2.0, 3.0]])
print(trafo.unit_vector(data, axis=1))
If you work with multidimensional arrays, the following fast solution is possible.
Say we have a 2D array that we want to normalize along the last axis, while some rows have zero norm.
import numpy as np
arr = np.array([
    [1, 2, 3],
    [0, 0, 0],
    [5, 6, 7]
], dtype=float)
lengths = np.linalg.norm(arr, axis=-1)
print(lengths) # [ 3.74165739 0. 10.48808848]
arr[lengths > 0] = arr[lengths > 0] / lengths[lengths > 0][:, np.newaxis]
print(arr)
# [[0.26726124 0.53452248 0.80178373]
# [0. 0. 0. ]
# [0.47673129 0.57207755 0.66742381]]
If you want to normalize n dimensional feature vectors stored in a 3D tensor, you could also use PyTorch:
import numpy as np
from torch import FloatTensor
from torch.nn.functional import normalize
vecs = np.random.rand(3, 16, 16, 16)
norm_vecs = normalize(FloatTensor(vecs), dim=0, eps=1e-16).numpy()
If you're working with 3D vectors, you can do this concisely using the toolbelt vg. It's a light layer on top of numpy and it supports single values and stacked vectors.
import numpy as np
import vg
x = np.random.rand(1000)*10
norm1 = x / np.linalg.norm(x)
norm2 = vg.normalize(x)
print(np.all(norm1 == norm2))
# True
I created the library at my last startup, where it was motivated by uses like this: simple ideas which are way too verbose in NumPy.
Without sklearn, using just NumPy, you can define a function.
Assuming that the rows are the variables and the columns the samples (axis=1):
import numpy as np
# Example array
X = np.array([[1,2,3],[4,5,6]])
def stdmtx(X):
    means = X.mean(axis=1)
    stds = X.std(axis=1, ddof=1)
    X = X - means[:, np.newaxis]
    X = X / stds[:, np.newaxis]
    return np.nan_to_num(X)
output:
X
array([[1, 2, 3],
[4, 5, 6]])
stdmtx(X)
array([[-1., 0., 1.],
[-1., 0., 1.]])
For a 2D array, you can use the following one-liner to normalize across rows. To normalize across columns, simply set axis=0.
a / np.linalg.norm(a, axis=1, keepdims=True)
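Note that this one-liner yields nan for any all-zero row, since it divides 0 by 0. A minimal sketch of a zero-safe variant, assuming zero rows should simply stay zero (the function name normalize_rows is illustrative):

```python
import numpy as np

def normalize_rows(a):
    """Scale each row to unit L2 norm; all-zero rows are left as zeros."""
    a = np.asarray(a, dtype=float)
    norms = np.linalg.norm(a, axis=1, keepdims=True)
    out = np.zeros_like(a)
    # Divide only where the norm is non-zero; other entries keep the 0 from `out`.
    np.divide(a, norms, out=out, where=norms != 0)
    return out

result = normalize_rows([[3, 4], [0, 0]])
print(result)
```

The where argument is broadcast against the input, so a column of per-row norms is enough to mask whole rows.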
If you want all values in [0; 1] for 1d-array then just use
(a - a.min(axis=0)) / (a.max(axis=0) - a.min(axis=0))
Where a is your 1d-array.
An example:
>>> a = np.array([0, 1, 2, 4, 5, 2])
>>> (a - a.min(axis=0)) / (a.max(axis=0) - a.min(axis=0))
array([0. , 0.2, 0.4, 0.8, 1. , 0.4])
A note on this method: it preserves the proportions between values only if the 1d-array contains at least one 0 and otherwise consists only of non-negative numbers.
A simple dot product would do the job. No need for any extra package.
x = x/np.sqrt(x.dot(x))
By the way, if the norm of x is zero, it is inherently a zero vector, and cannot be converted to a unit vector (which has norm 1). If you want to catch the case of np.array([0,0,...0]), then use
norm = np.sqrt(x.dot(x))
x = x/norm if norm != 0 else x

SymPy - substitute symbolic entries in a matrix

I have a python function which generates a sympy.Matrix with symbolic entries. It works effectively like:
import sympy as sp
M = sp.Matrix([[1,0,2],[0,1,2],[1,2,0]])
def make_symbolic_matrix(M):
    M_sym = sp.zeros(3)
    syms = sp.symbols('a0:3')  # symbols created locally inside the function
    for i in range(3):
        for j in range(3):
            if M[i,j] == 1:
                M_sym[i,j] = syms[i]
            elif M[i,j] == 2:
                M_sym[i,j] = 1 - syms[i]
    return M_sym
This works just fine. I get a matrix out, which I can use for all the symbolical calculations I need.
My issue is that I now want to evaluate my matrix at specified parameter values. Usually I would just use the .subs method. However, since the symbols that are now used as entries in my matrix were originally defined as temporary variables inside a function, I don't know how to refer to them.
It seems as if it should be possible, since I'm able to perform symbolic calculations.
What I want to do would look something like (following the code above):
M_sym = make_symbolic_matrix(M)
M_eval = M_sym.subs([(a0,.8),(a1,.3),(a2,.5)])
But all I get is "name 'a0' is not defined".
I'd be super happy if someone out there got a solution!
PS. I'm not just defining the symbols globally, because in the actual problem I don't know how many parameters I have from time to time.
In the general case, I assume you're looking for an n-by-m matrix of symbolic elements.
import sympy
def make_symbolic(n, m):
    rows = []
    for i in range(n):
        col = []
        for j in range(m):
            col.append(sympy.Symbol('a%d%d' % (i,j)))
        rows.append(col)
    return sympy.Matrix(rows)
which could be used in the following way:
make_symbolic(3, 4)
to give:
Matrix([
[a00, a01, a02, a03],
[a10, a11, a12, a13],
[a20, a21, a22, a23]])
once you've got that matrix you can substitute in any values required.
Given that the answer from Andrew was helpful it seems like you might be interested in the MatrixSymbol.
In [1]: from sympy import *
In [2]: X = MatrixSymbol('X', 3, 4)
In [3]: X # completely symbolic
Out[3]: X
In [4]: Matrix(X) # Expand to explicit matrix
Out[4]:
⎡X₀₀ X₀₁ X₀₂ X₀₃⎤
⎢ ⎥
⎢X₁₀ X₁₁ X₁₂ X₁₃⎥
⎢ ⎥
⎣X₂₀ X₂₁ X₂₂ X₂₃⎦
But to answer your original question, perhaps you could get the symbols out of the matrix that you produce?
x12 = X[1, 2]
Symbols are defined by their name. Two symbols with the same name (and the same assumptions) will be considered the same thing by SymPy. So if you know the names of the symbols you want, just create them again using symbols.
You may also find good use of the zip function of Python.
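A minimal sketch of that suggestion, assuming the function creates its symbols with sp.symbols('a0:3') as in the question. The names names, values and M_eval are illustrative; the values are the ones from the question.

```python
import sympy as sp

def make_symbolic_matrix(M):
    syms = sp.symbols('a0:3')  # local to the function
    M_sym = sp.zeros(3)
    for i in range(3):
        for j in range(3):
            if M[i, j] == 1:
                M_sym[i, j] = syms[i]
            elif M[i, j] == 2:
                M_sym[i, j] = 1 - syms[i]
    return M_sym

M = sp.Matrix([[1, 0, 2], [0, 1, 2], [1, 2, 0]])
M_sym = make_symbolic_matrix(M)

# Recreating symbols with the same names gives objects equal to the originals.
names = sp.symbols('a0:3')
values = [0.8, 0.3, 0.5]

# zip pairs each symbol with its value for the substitution.
M_eval = M_sym.subs(dict(zip(names, values)))
print(M_eval)
```

If the names are not known in advance, M_sym.free_symbols returns the set of symbols that actually occur in the matrix.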