Unexpected result of a substitution inside a function argument - sympy

In [4]: from sympy import pi, sin, solve, symbols
...: t, w0 = symbols('t omega_0')
...: wb = 6*w0
...: f = sin(wb*t)
...: display(f)
...: t1 = solve(wb*t-pi, t)
...: display(t1)
...: f1 = f.subs({t:t1})
...: display(f1)
Since no free variables remain, I expected the substitution to give sin(pi) or, even better, 0.
Obviously I'm missing something...
As Oscar Benjamin points out in the answer below, t1 is a list. As a matter of fact, I need a coffee.

I get an error when I run this code with the latest version of SymPy:
In [4]: f
Out[4]: sin(6⋅ω₀⋅t)
In [5]: t1 = solve(wb*t-pi, t)
In [6]: t1
⎡ π  ⎤
⎢────⎥
⎣6⋅ω₀⎦
In [7]: f1 = f.subs({t:t1})
SympifyError: [pi/(6*omega_0)]
That's because t1 is a list. solve returns a list because, in general, the equation being solved can have more than one solution. You should substitute the item in the list for t, which you can refer to as t1[0]:
In [8]: f1 = f.subs({t:t1[0]})
In [9]: f1
Out[9]: 0
Alternatively I recommend using solve with the dict=True argument so that it always returns a list of dicts. Each dict can be used directly with subs:
In [10]: t1 = solve(wb*t-pi, t, dict=True)
In [11]: t1
⎡⎧ π ⎫⎤
⎢⎨t: ────⎬⎥
⎣⎩ 6⋅ω₀⎭⎦
In [12]: f.subs(t1[0])
Out[12]: 0
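Because solve can return several solutions, the dict=True form also lends itself to substituting every solution in turn. A minimal sketch, reusing the names from the question; the quadratic g is an illustrative addition and not part of the original post:

```python
from sympy import pi, sin, solve, symbols

t, w0 = symbols('t omega_0')
wb = 6*w0
f = sin(wb*t)

# dict=True yields a list of {symbol: value} dicts, one per solution,
# so each entry can be passed straight to subs.
solutions = solve(wb*t - pi, t, dict=True)
f_values = [f.subs(sol) for sol in solutions]

# Illustrative equation with two roots (hypothetical, not from the post).
g = t**2 - 4
g_values = [g.subs(sol) for sol in solve(g, t, dict=True)]
```

Looping over the list avoids hard-coding the index 0 when the number of solutions is not known in advance.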


Sympy: Is it possible to use the function collect() with IndexedBase variables?

I'm trying to use the function collect() to simplify my expression. My desired result is
My code:
from sympy import *
i = symbols('i' , integer = True )
a = symbols( 'a' )
alpha = IndexedBase('alpha', positive=True, domain=QQ)
index = (i, 1, 3)
rho = symbols( 'rho')
U = product( alpha[i]**(1/(rho-1)) , index )
My solution attempt:
U = U.subs(1/(rho-1),a)
collect(U,rho, evaluate=False)[1]
What am I doing wrong?
You must be using a fairly old version of SymPy because in recent versions the form that you wanted arises automatically. In any case you should be able to use powsimp:
In [9]: U
        a         a         a
alpha[1] ⋅alpha[2] ⋅alpha[3]
In [10]: powsimp(U, force=True)

Why doesn't N('abs(2)') simplify to 2 in Sympy 1.5.1?

This is my code:
from sympy import *
from sympy.parsing.sympy_parser import parse_expr
x, y, z, t = symbols('x y z t')
N('abs(2)')
It returns abs(2) instead of 2 running on Jupyter Notebook on Anaconda. Isn't N() meant to evaluate numerical expressions?
I thought that when you give N() a string, it parses automatically, but just in case I checked:
expr = parse_expr('abs(2)')
This again returns abs(2)
The function is called Abs in sympy. What you get back from parse_expr is an arbitrary function that just happens to be called abs:
In [8]: parse_expr('f(2)')
Out[8]: f(2)
In [9]: parse_expr('abs(2)')
Out[9]: abs(2)
In [10]: parse_expr('Abs(2)')
Out[10]: 2
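If the input strings cannot be changed to use Abs, one possible workaround is to tell the parser what the lowercase name should mean. The sketch below assumes that passing local_dict to parse_expr is acceptable in your setting; depending on the SymPy version, the plain string may or may not already evaluate on its own:

```python
from sympy import Abs
from sympy.parsing.sympy_parser import parse_expr

# Map the lowercase name to SymPy's Abs explicitly while parsing.
expr = parse_expr('abs(2)', local_dict={'abs': Abs})

# With a symbolic argument the result stays as an unevaluated Abs(x).
symbolic = parse_expr('abs(x)', local_dict={'abs': Abs})
```

Here expr is the integer 2, while symbolic remains Abs(x) until x is given a value.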

vectorizer.fit_transform gives NotImplementedError : adding a nonzero scalar to a sparse matrix is not supported

I am trying to create a term-document matrix using a custom analyser to extract features from the documents. The code is as follows:
vectorizer = CountVectorizer( \
def customAnalyzer(text):
    grams = analyzer(text)
    # drop n-grams that consist only of digits and whitespace
    tgrams = [gram for gram in grams if not re.match(r"^[0-9\s]+$", gram)]
    return tgrams
This function serves as the custom analyser, which CountVectorizer uses to extract the features.
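The filtering step can be tried in isolation, without scikit-learn. In the sketch below, simple_ngrams is a hypothetical stand-in for the vectorizer's built-in analyzer (lowercase, whitespace split, unigrams and bigrams); only the regular expression comes from the question:

```python
import re

NUMERIC_ONLY = re.compile(r"^[0-9\s]+$")

def simple_ngrams(text, max_n=2):
    """Hypothetical stand-in analyzer: word n-grams for n = 1..max_n."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, max_n + 1):
        for start in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[start:start + n]))
    return grams

def custom_analyzer(text):
    # Keep every n-gram that is not made up solely of digits and whitespace.
    return [gram for gram in simple_ngrams(text) if not NUMERIC_ONLY.match(gram)]

result = custom_analyzer("buy 2 10 apples")
```

Mixed n-grams such as "buy 2" survive the filter; only purely numeric ones ("2", "10", "2 10") are dropped.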
for i in xrange(0, num_rows):
    clean_query.append(review_to_words(inp["keyword"][i], units))
vectorizer = CountVectorizer(analyzer=customAnalyzer,
                             tokenizer=None,
                             ngram_range=(1, 2),
                             preprocessor=None,
                             stop_words=None,
                             max_features=n)
features = vectorizer.fit_transform(clean_query)
z = vectorizer.get_feature_names()
This call throws the following error:
(<type 'exceptions.NotImplementedError'>, 'python.py', 128,NotImplementedError('adding a nonzero scalar to a sparse matrix is not supported',))
The error is raised when the vectorizer's fit_transform is called.
But the value of the variable clean_query is not a scalar. I am using sklearn 0.17.1.
I ran a small test to try to reproduce the error, but it did not throw the same error for me. (The example is taken from the scikit-learn Feature extraction documentation.)
scikit-learn version : 0.19.dev0
In [1]: corpus = [
...: ... 'This is the first document.',
...: ... 'This is the second second document.',
...: ... 'And the third one.',
...: ... 'Is this the first document?',
...: ... ]
In [2]: from sklearn.feature_extraction.text import TfidfVectorizer
In [3]: vectorizer = TfidfVectorizer(min_df=1)
In [4]: vectorizer.fit_transform(corpus)
<4x9 sparse matrix of type '<type 'numpy.float64'>'
with 19 stored elements in Compressed Sparse Row format>
In [5]: import numpy as np
In [6]: np.isscalar(corpus)
Out[6]: False
In [7]: type(corpus)
Out[7]: list
As the code above shows, corpus is not a scalar; its type is list.
I think the solution lies in how you build the clean_query variable: it needs to be in the form that vectorizer.fit_transform expects, i.e. a list of strings like corpus.

Binomial iterated expectation in Cython

I am brand new to Cython. How do I convert the Python function called Values below to Cython? With factors=2 and i=60 it takes 2.8 seconds on my big Linux box. The goal is under 1 second with factors=2 and i=360.
Here's the code. Thanks!
import numpy as np
import itertools
class Numeraire:
    def __init__(self, rate):
        self.rate = rate
    def __call__(self, timenext, time, state):
        return np.exp(-self.rate*(timenext - time))

def Values(values, i1, i0=0, numeraire=Numeraire(0.)):
    factors = len(values.shape)
    norm = 0.5**factors  # each of the 2**factors branches is equally likely
    for i in np.arange(i1-1, i0-1, -1):
        for j in itertools.product(np.arange(i+1), repeat=factors):
            value = 0.
            for k in itertools.product(np.arange(2), repeat=factors):
                value += values[tuple(np.array(j) + np.array(k))]
            values[j] = value*norm*numeraire(i+1, i, j)
    return values

factors = 2
i = 60
values = np.ones([i+1]*factors)
Values(values, i, numeraire=Numeraire(0.05/12))
print values[(0,)*factors], np.exp(-0.05/12*i)
Here's my latest answer (no Cython!), which runs in 125 msec for the factor=2, i=360 case.
import numpy as np
import itertools

# the two shifted views along one axis: "down" and "up" neighbours
slices = (slice(None, -1, None), slice(1, None, None))

def Expectation(values, numeraire, i, i0=0):
    def Values(values, i):
        factors = values.ndim
        expect = np.zeros((i,)*factors)
        for j in itertools.product(slices, repeat=factors):
            expect += values[j]
        return expect*0.5**factors*numeraire(i, i-1)
    return reduce(Values, range(i, i0, -1), values)

class Numeraire:
    def __init__(self, factors, rate=0):
        self.factors = factors
        self.rate = rate
    def __call__(self, timenext, time):
        return np.full((time+1,)*self.factors, np.exp(-self.rate*(timenext - time)))

factors = 2
i = 360
values, numeraire = np.ones((i+1,)*factors), Numeraire(factors, 0.05/12)
%timeit Expectation(values, numeraire, i)
Expectation(values, numeraire, i)[(0,)*factors], np.exp(-0.05/12*i)
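The core of the approach above is that one backward step is just the average of the 2**factors shifted views of the array. A minimal Python 3 sketch of that single step, assuming a constant scalar discount per step (the names backward_step, steps and rate are illustrative, not from the post):

```python
import itertools
import numpy as np

def backward_step(values, discount):
    """One binomial step: average the 2**ndim shifted views, then discount."""
    shifts = (slice(None, -1), slice(1, None))
    total = np.zeros(tuple(size - 1 for size in values.shape))
    for view in itertools.product(shifts, repeat=values.ndim):
        total += values[view]
    return total * 0.5 ** values.ndim * discount

steps, factors, rate = 12, 2, 0.05 / 12
values = np.ones((steps + 1,) * factors)
for _ in range(steps):
    values = backward_step(values, np.exp(-rate))

# With a payoff of 1 everywhere, only the discounting remains.
present_value = values[(0,) * factors]
expected = np.exp(-rate * steps)
```

Each step shrinks every axis by one, so after steps iterations a single entry is left, and for a constant payoff of 1 it should agree with the plain discount factor.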
Before using Cython, you should optimize your code with NumPy. Here, vectorizing the two inner for loops (the second and third) yields a 40x speed-up:
In [1]: import numpy as np
...: import itertools
...: # define the Numeraire class and Values function from the question above
...: def Values2(values, i1, i0=0, numeraire=Numeraire(0.)):
...:     factors = len(values.shape)
...:     norm = 0.5**factors
...:     k = np.array(list(itertools.product(np.arange(2), repeat=factors)))
...:     for i in np.arange(i1-1, i0-1, -1):
...:         j = np.array(list(itertools.product(np.arange(i+1), repeat=factors)))
...:         mask_all = j[:,:,np.newaxis] + k.T[np.newaxis, :, :]
...:         mask_x, mask_y = np.swapaxes(mask_all, 2, 1).reshape(-1, 2).T
...:         values_tmp = values[mask_x, mask_y].reshape((j.shape[0], k.shape[0]))
...:         values_tmp = values_tmp.sum(axis=1)
...:         values[j[:,0], j[:,1]] = values_tmp*norm*numeraire(i+1, i, j)
...:     return values
...: factors = 2
...: i = 60
...: values = lambda : np.ones([i+1]*factors)
...: print values()[(0,)*factors], np.exp(-0.05/12*i)
...: res = Values(values(), i, numeraire=Numeraire(0.05/12))
...: res2 = Values2(values(), i, numeraire=Numeraire(0.05/12))
...: np.testing.assert_allclose(res, res2)
...: %timeit Values(values(), i, numeraire=Numeraire(0.05/12))
...: %timeit Values2(values(), i, numeraire=Numeraire(0.05/12))
1.0 0.778800783071
1 loops, best of 3: 1.26 s per loop
10 loops, best of 3: 31.8 ms per loop
The next step would be to replace the line
j = np.array(list(itertools.product(np.arange(i+1), repeat=factors)))
with its NumPy equivalent, taken from this answer (not very pretty):
def itertools_product_numpy(some_list, some_length):
    return some_list[np.rollaxis(
        np.indices((len(some_list),) * some_length), 0, some_length + 1)
        .reshape(-1, some_length)]
j = itertools_product_numpy(np.arange(i+1), factors)
This results in an overall 160x speed-up, and the code runs in 1.2 seconds on my laptop for i=360 and factors=2.
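To check that an index-grid construction of this kind really reproduces the ordering of itertools.product, a small comparison can be run. This sketch uses np.moveaxis rather than np.rollaxis (a deliberate swap; both move the leading axis to the end here), and product_indices is an illustrative name:

```python
import itertools
import numpy as np

def product_indices(items, repeat):
    """All length-`repeat` tuples over `items`, in itertools.product order."""
    items = np.asarray(items)
    grid = np.indices((len(items),) * repeat)   # shape (repeat, n, ..., n)
    index_rows = np.moveaxis(grid, 0, -1).reshape(-1, repeat)
    return items[index_rows]

via_numpy = product_indices(np.arange(4), 2)
via_itertools = np.array(list(itertools.product(np.arange(4), repeat=2)))
same = np.array_equal(via_numpy, via_itertools)
```

The reshape walks the grid in C order, which is why the rows come out in the same sequence that itertools.product generates.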
With this last version, I don't think porting to Cython will gain you much, since only one loop remains and it has only ~360 iterations. Further speed-ups are more likely to come from fine-tuned Python/NumPy optimizations.
Alternatively, you can try applying Cython to your original implementation. However, because it is based on itertools.product, which is slow when called repeatedly in a loop, Cython will not help there.

Converting a Set to Dict in python

I have a set like this:
x = set([u'[{"Mychannel":"sample text"},"p"]'])
I need to convert it into a dict.
The output I need is:
x = {'mychannel':'sampletext'}
How can I do this?
It looks like you can unpack that crazy thing like this:
>>> x = set([u'[{"Mychannel":"sample text"}, "p"]'])
>>> lst = list(x)
>>> lst
[u'[{"Mychannel":"sample text"}, "p"]']
>>> lst[0]
u'[{"Mychannel":"sample text"}, "p"]'
>>> inner_lst = eval(lst[0])
>>> inner_lst
[{'Mychannel': 'sample text'}, 'p']
>>> d = inner_lst[0]
>>> d
{'Mychannel': 'sample text'}
However, as @MattDMo suggests in the comments, I strongly suggest you re-evaluate this data structure, if only to eliminate the step where you need eval to use it!
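Since the string inside the set happens to be valid JSON, one way to avoid eval is json.loads from the standard library. This is a sketch in Python 3 syntax, and the final lowercasing and space removal is only an assumption about what the desired output in the question implies:

```python
import json

x = {'[{"Mychannel":"sample text"},"p"]'}

# Take the single string out of the set and parse it as JSON.
raw = next(iter(x))
inner = json.loads(raw)      # a list: [dict, "p"]
d = inner[0]

# Optional normalisation (assumed from the desired output in the question):
# lowercase the keys and strip spaces from the values.
normalized = {key.lower(): value.replace(" ", "") for key, value in d.items()}
```

Unlike eval, json.loads only parses data and never executes code, so it is safer when the string comes from an untrusted source.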