The Meijer G-function is a neat instrument for treating the multiplication of random variables, and for a project I am conducting on this subject I am trying to use it in Sympy (since it is not present in Sage or other free programs).
It looks like the "meijerg" packages available in Sympy provide a wide set of instruments.
I succeeded in importing it together with the relevant package for integrals ("sympy.integrals.meijerint") and could start doing some basic manipulations, like plotting, inverting the argument (_flip_g), computing values, etc.
However, despite my best efforts, I cannot get Sympy to perform some of the most basic simplifications, for example the "absorption" of powers of the argument.
So after defining
b1,b2,b3,b4,b5,d1,d2,d3,d4,d5 = symbols('b1 b2 b3 b4 b5 d1 d2 d3 d4 d5')
a1,a2,a3,a4,a5,c1,c2,c3,c4,c5 = symbols('a1 a2 a3 a4 a5 c1 c2 c3 c4 c5')
y,w,z =symbols('y w z',positive=True)
def G1(x):
    return meijerg([[a1,a2,a3],[a4,a5]], [[b1,b2],[b3,b4]], x)
def G2(x):
    return meijerg([[c1,c2],[c3]], [[d1,d2,d3],[d4]], x)
then asking the integral
Ris=_int0oo(G1(y*x),G2(w*x),x)
Ris
I get (on Jupyter)
and I find no way to "absorb" the y in the denominator.
Instead, if I input
integrate(G1(y*x)*G2(w*x),(x,0,oo))
I get
and the first line is in fact what I would like to get.
So my question is why the absorption simplification is not attainable, or how it can be attained with any of the instruments in the package (_rewrite1, _guess_expansion, etc.).
---- addendum ---
I realize from the comments that I fell into a newbie trap: thanks indeed to Davide for pointing it out.
However, apart from and before integrating, some basic algebraic manoeuvres on G, like inverting the argument (the "hidden" tool _flip_g), absorbing a power of the argument, rewriting a function as G, and the like, would be very useful.
Is there any way to access them properly? If not, this stands as a kind request to the developers to make them usable. Thanks
So I am trying to create a sheet to help our HR Department create the emails for new hires. One of the issues is we use a format of First Initial Last Name as our naming scheme, but if you don't check it can double up with common last names. HR usually does not check for previous emails that currently exist.
Basic recreation I am trying to do is this:
Username: IF(F2<>"", F2, IF(COUNTIF(A:A, D2) > 1, E2, D2))
First Choice: LEFT(B2, 1) & B3
Second Choice: B2 & B3
What I want for A2:
So basically, if Override is set, I want it to use that. If no override is set, I want to check whether First Choice is already found in column A; if it is already used, then use Second Choice. I keep getting a circular dependency. I even tried having the calculation done in column G, which works. But once I try to set A2 to G2, it gives the circular dependency error again.
you can outsmart it...
paste in A2 cell:
=ARRAYFORMULA(IF(F2:F<>"", F2:F,
IF(COUNTIFS(IF(F2:F<>"", F2:F, D2:D),
IF(F2:F<>"", F2:F, D2:D), ROW(A2:A), "<="&ROW(A2:A))=1,
IF(F2:F<>"", F2:F, D2:D), E2:E)))
paste in D2 cell:
=ARRAYFORMULA(LOWER(LEFT(B2:B, 1)&C2:C))
paste in E2 cell:
=ARRAYFORMULA(LOWER(B2:B&C2:C))
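The logic behind the A2 formula can be sketched outside the spreadsheet. The following Python is only an illustration of the rule the formulas appear to implement (override wins, otherwise the first choice is used unless the same candidate already occurred in an earlier row); the function name and the sample names are hypothetical and not part of the sheet.

```python
def assign_usernames(rows):
    """rows: list of (first, last, override) tuples; returns one username per row."""
    seen = {}      # running count of each candidate, like COUNTIFS limited to rows <= current
    result = []
    for first, last, override in rows:
        first_choice = (first[:1] + last).lower()   # column D
        second_choice = (first + last).lower()      # column E
        candidate = override or first_choice
        seen[candidate] = seen.get(candidate, 0) + 1
        if override:
            result.append(override)                 # column F takes priority
        elif seen[candidate] == 1:
            result.append(candidate)                # first occurrence keeps the short form
        else:
            result.append(second_choice)            # later duplicates fall back
    return result


# Hypothetical sample data
rows = [("John", "Smith", ""), ("Jane", "Smith", ""), ("Jim", "Smith", "jim.s")]
usernames = assign_usernames(rows)
print(usernames)
```

Because each row only looks at the rows above it, nothing depends on its own output, which is why this approach avoids the circular dependency.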
If you are getting a circular dependency you may just need to change the calculation settings.
Go to File > Spreadsheet Settings > Calculation and switch Iterative Calculation on
Let me know if this doesn't work!
I have a data-frame X which has two categorical features and 41 numerical features. So X has a total of 43 features.
Now, I would like to convert the categorical features into numerical levels so they can be used in a RandomForest classifier.
I have done the following, where 0 and 1 indicate the locations of the categorical features:
import pandas as pd
X = pd.read_csv("train.csv")
F1 = pd.get_dummies(X.iloc[:, 0])
F2 = pd.get_dummies(X.iloc[:, 1])
Then, I concatenate these two data-frames:
Xnew = pd.concat([F1, F2, X.iloc[:, 2:]], axis=1)
Now, Xnew has 63 features (F1 has 18 and F2 has 4 features, remaining 41 are from X)
Is this correct? Is there a better way of doing the same thing? Do I need to drop the first column from F1 and F2 to avoid collinearity?
Since F1 has 18 levels (not features) and F2 has 4, your result looks correct.
To avoid collinearity, you should drop one of the columns (for each of F1 and F2). It need not be the first column. Typically you drop the column with the most common level.
Why the one with the most common level? Think about feature importance. If you drop one column, it has no chance to get its importance estimated. This level (the one you dropped) is like your "base level". Only deviations from the base level can be marked as important or not.
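As a minimal sketch of dropping the most common level, assuming a toy frame with made-up column names (`cat_a`, `cat_b`, `num`) and a hypothetical helper `dummies_without_base`, one could write:

```python
import pandas as pd

# Hypothetical toy frame: two categorical columns followed by one numeric column.
X = pd.DataFrame({
    "cat_a": ["red", "blue", "red", "green", "red"],
    "cat_b": ["s", "m", "m", "m", "l"],
    "num": [1.0, 2.5, 0.3, 4.1, 2.2],
})


def dummies_without_base(series):
    # One-hot encode, then drop the column of the most frequent level (the base level).
    base = series.value_counts().idxmax()
    dummies = pd.get_dummies(series, prefix=series.name)
    return dummies.drop(columns=f"{series.name}_{base}")


# axis=1 places the encoded columns side by side with the numeric ones.
Xnew = pd.concat(
    [dummies_without_base(X["cat_a"]), dummies_without_base(X["cat_b"]), X[["num"]]],
    axis=1,
)
print(Xnew.columns.tolist())
```

Here "red" and "m" act as the base levels, so each categorical feature contributes one column fewer than it has levels.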
My son is learning how to calculate the formula for a parabola using a directrix and focus point on his Khan Academy course. (a,b) is the focus point, k is the parameter for the directrix as y=k. I wanted to show him a simple way to check his results using Sympy; programming helps hugely in solidifying internal algorithms. Step 1 is clearly to set the equation out.
Parabola = Eq(sqrt((y-k)**2),sqrt((x-a)**2+(y-b)**2))
I first solved this for y, intending then to show how to substitute values and derive the equation, thus:
Y = solve(Parabola,y)
This was in a reasonable form, having collected the 1/(2b-2k) to the outside.
Next, I substituted the value of the focus and directrix into the equation, obtaining the equation y= 1/6*(x**2+16*x+49), which is correct.
He needed next to resolve this in a form (x+c1)(x+c2)+remainder. There does not seem to be a direct way to factor from the equation above into this form, at least not from an hour searching the docs.
Answer = Y[0].subs({a:-8,b:-1,k:-4})
factor(Answer,deep=True)
Of course I understand how to reduce to a square factorization plus remainder; my question is solely whether this is possible in sympy and, if so, how?
A second, perhaps trivial, question is why Sympy returns some factorizations as (constant - x) where (x - constant) is preferred: is there a way of specifying the form?
Thanks for any help, on behalf of my son, to whom I am showing the wonders of Sympy.
The process is usually called "completing the square". It is not implemented as a single SymPy method, but one can use the SymPy equation solver to find the coefficients of such a form of the polynomial:
>>> var('A B C')
>>> solve(Eq(Answer, A*(x-B)**2 + C), [A, B, C])
[(1/6, -8, -5/2)]
So the parabola vertex is at (-8, -5/2), and the polynomial can be written as 1/6*(x+8)**2 - 5/2
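As a cross-check that does not depend on the solver, the same form can be obtained from the coefficients directly. This is only a sketch: it re-enters the polynomial from the question by hand, and the names `h`, `k` and `completed` are chosen here for illustration.

```python
from sympy import Rational, expand, symbols

x = symbols('x')

# The polynomial from the question: y = 1/6*(x**2 + 16*x + 49)
answer = Rational(1, 6) * (x**2 + 16*x + 49)

# Coefficients of a*x**2 + b*x + c
a, b, c = answer.as_poly(x).all_coeffs()

# Completing the square: a*(x - h)**2 + k with h = -b/(2a), k = c - b**2/(4a)
h = -b / (2*a)
k = c - b**2 / (4*a)
completed = a*(x - h)**2 + k

print(h, k)
print(completed)
```

Expanding `completed` and subtracting `answer` should give zero, which confirms the two forms agree.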
I have a data set with tens of millions of rows. Several columns on this data represent categorical features. Each level of these features is represented by an alpha-numeric string like "b009d929".
C1 C2 C3 C4 C5 C6 C7
68fd1e64 80e26c9b fb936136 7b4723c4 25c83c98 7e0ccccf de7995b8 ...
68fd1e64 f0cf0024 6f67f7e5 41274cd7 25c83c98 fe6b92e5 922afcc0
I'd like to be able to use Python to map each distinct level to a number to save memory, so that feature C1's levels would be replaced by numbers from 1 to C1_n, C2's levels would be replaced by numbers from 1 to C2_n...
Each feature has different number of levels, ranging from under 10 to 10k+.
I tried dictionaries with Pandas' .replace() but it gets extremely slow.
What is a fast way to approach this problem?
I figured out that the categorical features values were hashed onto 32 bits. So I ended up reading the file in chunks and applying this simple function
int(categorical_feature_value, 16)
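A minimal sketch of that chunked conversion follows. It assumes every value is an 8-digit hexadecimal string with no missing entries, and it uses a small inline sample in place of the real file; the helper name `hex_to_uint32` and the chunk size are illustrative only.

```python
import io

import pandas as pd

# A few values from the question, standing in for the real file (inline sample data).
sample = io.StringIO(
    "C1 C2 C3\n"
    "68fd1e64 80e26c9b fb936136\n"
    "68fd1e64 f0cf0024 6f67f7e5\n"
)


def hex_to_uint32(column):
    # Parse each hex string as a base-16 integer and store it in 32 bits.
    return column.map(lambda v: int(v, 16)).astype("uint32")


chunks = []
# chunksize=1 only because the sample is tiny; a real file would use a much larger value.
for chunk in pd.read_csv(sample, sep=" ", dtype=str, chunksize=1):
    chunks.append(chunk.apply(hex_to_uint32))

encoded = pd.concat(chunks, ignore_index=True)
print(encoded.dtypes)
```

Note that this yields the hash values themselves rather than consecutive numbers from 1 to C1_n, but each level still takes four bytes instead of a Python string.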
Is there a command in sympy to simplify sinh(x)+cosh(x) to exp(x)? If I issue
from sympy import *
x = Symbol('x')
(sinh(x)+cosh(x)).simplify()
I just get sinh(x)+cosh(x) back, but I want to see exp(x) instead.
Even assuming that the simplify function in sympy was very good, what you suggest may not have worked, because what is "simple" is not rigorously defined.
I think what you want is the functionality present in .rewrite:
In [1]: (sinh(x)+cosh(x)).rewrite(exp)
Out[1]:
 x
e
You can use .rewrite for many other transformations including gamma <-> combinatorics and inverse trig <-> logarithms.
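A few of these transformations, sketched with plain printing instead of the pretty printer (the symbol `n` is introduced here only for illustration):

```python
from sympy import Symbol, asinh, cosh, exp, factorial, gamma, log, sinh

x = Symbol('x')
n = Symbol('n')

# Hyperbolic functions rewritten in terms of exponentials
as_exp = (sinh(x) + cosh(x)).rewrite(exp)

# Combinatorial function rewritten in terms of gamma
as_gamma = factorial(n).rewrite(gamma)

# Inverse hyperbolic function rewritten in terms of logarithms
as_log = asinh(x).rewrite(log)

print(as_exp)
print(as_gamma)
print(as_log)
```

The argument to `.rewrite` is the target function, so the same expression can be sent to different forms depending on what is needed.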