trying to use DiscreteUniform as a numpy index - pymc3

I am trying to use pymc3.DiscreteUniform as an index for a numpy 1D array.
This worked with pymc (v2), but I am transitioning to pymc3, and code that worked under pymc does not work under pymc3.
import pymc3 as pm
d0 = pm.DiscreteUniform('d0', lower=0, upper=nDens - 1, testval=nDens//2)
pred = np.zeros(len(box.match), np.float64)
for iwvl, amatch in enumerate(box.match):
    pred[iwvl] += amatch['intensitySum'][d0]
I get the following error message:
IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices

I have found something that works, but it involves going into theano and theano.tensor.
import numpy as np
import pymc3 as pm
import theano
import theano.tensor as tensor

with pm.Model() as model:
    em0 = pm.Normal('em0', mu=emLog, sigma=0.2)
    d0 = pm.DiscreteUniform('d0', lower=0, upper=nDens - 1, testval=Dindex)
    boundNormal = pm.Bound(pm.Normal, lower=0.0)
    wght = boundNormal('wght', mu=0.2, sigma=0.1)
    pred = np.zeros((nDens, len(box.match)), np.float64)
    for iwvl, amatch in enumerate(box.match):
        pred[0:, iwvl] += amatch['intensitySum']
    xpred = theano.shared(pred, name='p0')
    idx = tensor.as_tensor_variable(d0)
    predicted = xpred[idx] * 10.**em0
    nObs = len(box.match)
    intensity = np.zeros(nObs, np.float64)
    for iwvl in range(nObs):
        intensity[iwvl] = box.match[iwvl]['obsIntensity']
    sigma = 0.2
    Y_obs = pm.Normal('Y_obs', mu=predicted, sigma=wght*intensity, observed=intensity)
    trace = pm.sample(tune=20000, draws=100000, target_accept=0.85)
You can then work with the trace. It is even possible to make sigma a pm variable.
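The essential pattern, distilled into a minimal sketch with placeholder data (the array contents here are stand-ins, not the original intensitySum data): move the NumPy array into a theano.shared variable, which, unlike a plain NumPy array, can be indexed with a symbolic variable.
import numpy as np
import pymc3 as pm
import theano
import theano.tensor as tensor

values = np.arange(10, dtype=np.float64)         # stand-in for the intensitySum data
xvalues = theano.shared(values, name='xvalues')  # NumPy array -> Theano shared variable
with pm.Model() as model:
    d0 = pm.DiscreteUniform('d0', lower=0, upper=len(values) - 1, testval=len(values)//2)
    idx = tensor.as_tensor_variable(d0)          # symbolic index
    selected = xvalues[idx]                      # indexing the shared tensor works where values[d0] would fail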

Related

keras custom activation to drop under certain conditions

I am trying to drop (zero out) the values that are less than 1 and greater than -1 in my custom activation, as below.
def ScoreActivationFromSigmoid(x, target_min=1, target_max=9):
    condition = K.tf.logical_and(K.tf.less(x, 1), K.tf.greater(x, -1))
    case_true = K.tf.reshape(K.tf.zeros([x.shape[1] * x.shape[2]], tf.float32), shape=(K.tf.shape(x)[0], x.shape[1], x.shape[2]))
    case_false = x
    changed_x = K.tf.where(condition, case_true, case_false)
    activated_x = K.sigmoid(changed_x)
    score = activated_x * (target_max - target_min) + target_min
    return score
The input data has 3 dimensions: batch_size x sequence_length x number of features.
But I got this error:
InvalidArgumentError: Inputs to operation activation_51/Select of type Select must have the same size and shape. Input 0: [1028,300,64] != input 1: [1,300,64]
[[{{node activation_51/Select}} = Select[T=DT_FLOAT, _class=["loc:#training_88/Adam/gradients/activation_51/Select_grad/Select_1"], _device="/job:localhost/replica:0/task:0/device:GPU:0"](activation_51/LogicalAnd, activation_51/Reshape, dense_243/add)]]
[[{{node metrics_92/acc/Mean_1/_9371}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_473_metrics_92/acc/Mean_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
I understand what the problem is: the custom activation function cannot find the proper batch size of the inputs. But I don't know how to control that.
Can anyone fix this or suggest other methods to replace some of the element values under certain conditions?
The error message I got when running your code is:
ValueError: Cannot reshape a tensor with 19200 elements to shape [1028,300,64] (19737600 elements) for 'Reshape_8' (op: 'Reshape') with input shapes: [19200], [3] and with input tensors computed as partial shapes: input[1] = [1028,300,64].
The problem is that you cannot reshape a tensor of shape [x.shape[1] * x.shape[2]] (300 * 64 = 19200 elements) to (K.tf.shape(x)[0], x.shape[1], x.shape[2]) (1028 * 300 * 64 = 19737600 elements), because their element counts differ.
So the solution is simply to create a zero tensor of the right shape.
This line:
case_true = K.tf.reshape(K.tf.zeros([x.shape[1] * x.shape[2]], tf.float32), shape=(K.tf.shape(x)[0], x.shape[1], x.shape[2]))
should be replaced with:
case_true = K.tf.reshape(K.tf.zeros([x.shape[0] * x.shape[1] * x.shape[2]], K.tf.float32), shape=(K.tf.shape(x)[0], x.shape[1], x.shape[2]))
or using K.tf.zeros_like:
case_true = K.tf.zeros_like(x)
Workable code:
import keras.backend as K
import numpy as np

def ScoreActivationFromSigmoid(x, target_min=1, target_max=9):
    condition = K.tf.logical_and(K.tf.less(x, 1), K.tf.greater(x, -1))
    case_true = K.tf.zeros_like(x)
    case_false = x
    changed_x = K.tf.where(condition, case_true, case_false)
    activated_x = K.tf.sigmoid(changed_x)
    score = activated_x * (target_max - target_min) + target_min
    return score

with K.tf.Session() as sess:
    x = K.tf.placeholder(K.tf.float32, shape=(1028, 300, 64), name='x')
    score = sess.run(ScoreActivationFromSigmoid(x), feed_dict={'x:0': np.random.randn(1028, 300, 64)})
    print(score)

Bayesian Inference with PyMC3. Compilation error.

The following two pieces of code do simple Bayesian inference in Python using PyMC3. While the first, for an exponential model, compiles and runs perfectly fine, the second, for a simple ODE model, gives an error. I do not understand why one works and the other does not. Please help.
Code #1
import numpy as np
import pymc3 as pm
def f(a, b, x, c):
    return a * np.exp(b*x) + c

# Generating data with error
a, b = 5, 0.2
xdata = np.linspace(0, 10, 21)
ydata = f(a, b, xdata, 0.5)
yerror = 5 * np.random.rand(len(xdata))
ydata += np.random.normal(0.0, np.sqrt(yerror))

model = pm.Model()
with model:
    alpha = pm.Uniform('alpha', lower=a/2, upper=2*a)
    beta = pm.Uniform('beta', lower=b/2, upper=2*b)
    mu = f(alpha, beta, xdata, 0.5)
    Y_obs = pm.Normal('Y_obs', mu=mu, sd=yerror, observed=ydata)
    trace = pm.sample(100, tune=50, nchains=1)
Code #2
import numpy as np
import pymc3 as pm
def solver(I, a, T, dt):
    """Solve u'=-a*u, u(0)=I, for t in (0,T] with steps of dt."""
    dt = float(dt)              # avoid integer division
    N = int(round(T/dt))        # no of time intervals
    print N
    T = N*dt                    # adjust T to fit time step dt
    u = np.zeros(N+1)           # array of u[n] values
    t = np.linspace(0, T, N+1)  # time mesh
    u[0] = I                    # assign initial condition
    for n in range(0, N):       # n=0,1,...,N-1
        u[n+1] = (1 - a*dt)*u[n]
    return np.ravel(u)

# Generating data
ydata = solver(1, 1.7, 10, 0.1)
yerror = 5 * np.random.rand(101)
ydata += np.random.normal(0.0, np.sqrt(yerror))

model = pm.Model()
with model:
    alpha = pm.Uniform('alpha', lower=1.0, upper=2.5)
    mu = solver(1, alpha, 10, 0.1)
    Y_obs = pm.Normal('Y_obs', mu=mu, sd=yerror, observed=ydata)
    trace = pm.sample(100, nchains=1)
The error is
Traceback (most recent call last):
File "1.py", line 27, in <module>
mu = solver(1,alpha,10,0.1)
File "1.py", line 16, in solver
u[n+1] = (1 - a*dt)*u[n]
ValueError: setting an array element with a sequence.
Please help.
The error is in this line:
mu = solver(1,alpha,10,0.1)
You are trying to pass alpha as a value, but alpha is a pymc3 distribution, i.e. a symbolic Theano variable. The function solver only works when you provide a number as the second argument: the line u[n+1] = (1 - a*dt)*u[n] assigns into a NumPy array, which cannot hold a symbolic value, hence the 'setting an array element with a sequence' error.
Code #1 works because this function
def f(a,b,x,c):
return a * np.exp(b*x)+c
only uses operations (multiplication, np.exp, addition) that Theano variables support, so it returns a symbolic expression instead of trying to write symbolic values into a NumPy array.
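One way to make code #2 run, sketched under the assumption that the solver must stay plain NumPy (this is not from the original answer): wrap it with theano.compile.ops.as_op so PyMC3 can call it with a symbolic alpha. Since as_op provides no gradient, a gradient-free step method such as Metropolis is needed.
import theano.tensor as tt
from theano.compile.ops import as_op

@as_op(itypes=[tt.dscalar], otypes=[tt.dvector])
def solver_op(alpha):
    # Theano hands in a concrete float, so the NumPy solver works as-is
    return solver(1, alpha, 10, 0.1)

with pm.Model() as model:
    alpha = pm.Uniform('alpha', lower=1.0, upper=2.5)
    mu = solver_op(alpha)
    Y_obs = pm.Normal('Y_obs', mu=mu, sd=yerror, observed=ydata)
    trace = pm.sample(100, step=pm.Metropolis(), chains=1)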

numerical integration python

I need to reduce the running time of quad() in Python (I am computing some thousands of integrals). I found a similar question here, where it was suggested to perform several integrations and add the partial values. However, that does not improve performance. Any thoughts? Here is a simple example:
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm
import time
funcB = lambda x: norm.pdf(x,0,1)
start = time.time()
good_missclasified,_ = quad(funcB, 0,3.3333)
stop = time.time()
time_elapsed = stop - start
print ('quad : ' + str(time_elapsed))
start = time.time()
num = np.linspace(0,3.3333,10)
Lv = []
last, lastG = 0, 0
for g in num:
    Lval, x = quad(funcB, lastG, g)
    last, lastG = last + Lval, g
    Lv.append(last)
Lv = np.array(Lv)
stop = time.time()
time_elapsed = stop - start
print ('10 int : ' + str(time_elapsed))
print(good_missclasified,Lv[9])
quadpy (a project of mine) is vectorized and can integrate a function over many domains (e.g., intervals) at once. You do have to choose your own integration method though.
import numpy
import quadpy
a = 0.0
b = 1.0
n = 100
start_points = numpy.linspace(a, b, n, endpoint=False)
h = (b-a) / n
end_points = start_points + h
intervals = numpy.array([start_points, end_points])
scheme = quadpy.line_segment.gauss_kronrod(3)
vals = scheme.integrate(numpy.exp, intervals)
print(vals)
[0.10050167 0.10151173 0.10253194 0.1035624 0.10460322 0.1056545
0.10671635 0.10778886 0.10887216 0.10996634 0.11107152 0.11218781
0.11331532 0.11445416 0.11560444 0.11676628 0.1179398 0.11912512
0.12032235 0.12153161 0.12275302 0.12398671 0.12523279 0.1264914
0.12776266 0.1290467 0.13034364 0.13165362 0.13297676 0.1343132
0.13566307 0.1370265 0.13840364 0.13979462 0.14119958 0.14261866
0.144052 0.14549975 0.14696204 0.14843904 0.14993087 0.15143771
0.15295968 0.15449695 0.15604967 0.157618 0.15920208 0.16080209
0.16241818 0.16405051 0.16569924 0.16736455 0.16904659 0.17074554
0.17246156 0.17419482 0.17594551 0.17771379 0.17949985 0.18130385
0.18312598 0.18496643 0.18682537 0.188703 0.1905995 0.19251505
0.19444986 0.19640412 0.19837801 0.20037174 0.20238551 0.20441952
0.20647397 0.20854907 0.21064502 0.21276204 0.21490033 0.21706012
0.21924161 0.22144502 0.22367058 0.22591851 0.22818903 0.23048237
0.23279875 0.23513842 0.2375016 0.23988853 0.24229945 0.2447346
0.24719422 0.24967857 0.25218788 0.25472241 0.25728241 0.25986814
0.26247986 0.26511783 0.2677823 0.27047356]

generate N random numbers from a skew normal distribution using numpy

I need a function in python to return N random numbers from a skew normal distribution. The skew needs to be taken as a parameter.
e.g. my current use is
x = numpy.random.randn(1000)
and the ideal function would be e.g.
x = randn_skew(1000, skew=0.7)
Solution needs to conform with: python version 2.7, numpy v.1.9
A similar answer is here: skew normal distribution in scipy. However, this generates a PDF, not the random numbers.
I start by generating the PDF curves for reference:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats

NUM_SAMPLES = 100000
SKEW_PARAMS = [-3, 0]

def skew_norm_pdf(x, e=0, w=1, a=0):
    # adapted from:
    # http://stackoverflow.com/questions/5884768/skew-normal-distribution-in-scipy
    t = (x-e) / w
    return 2.0 / w * stats.norm.pdf(t) * stats.norm.cdf(a*t)

# generate the skew normal PDF for reference:
location = 0.0
scale = 1.0
x = np.linspace(-5, 5, 100)
plt.subplots(figsize=(12, 4))
for alpha_skew in SKEW_PARAMS:
    p = skew_norm_pdf(x, location, scale, alpha_skew)
    # n.b. note that alpha is a parameter that controls skew, but the 'skewness'
    # as measured will be different. see the wikipedia page:
    # https://en.wikipedia.org/wiki/Skew_normal_distribution
    plt.plot(x, p)
Next I found a VB implementation of sampling random numbers from the skew normal distribution and converted it to python:
# literal adaptation from:
# http://stackoverflow.com/questions/4643285/how-to-generate-random-numbers-that-follow-skew-normal-distribution-in-matlab
# original at:
# http://www.ozgrid.com/forum/showthread.php?t=108175
def rand_skew_norm(fAlpha, fLocation, fScale):
    sigma = fAlpha / np.sqrt(1.0 + fAlpha**2)
    afRN = np.random.randn(2)
    u0 = afRN[0]
    v = afRN[1]
    u1 = sigma*u0 + np.sqrt(1.0 - sigma**2) * v
    if u0 >= 0:
        return u1*fScale + fLocation
    return (-u1)*fScale + fLocation

def randn_skew(N, skew=0.0):
    return [rand_skew_norm(skew, 0, 1) for x in range(N)]
# let's check they at least visually match the PDF:
plt.subplots(figsize=(12, 4))
for alpha_skew in SKEW_PARAMS:
    p = randn_skew(NUM_SAMPLES, alpha_skew)
    sns.distplot(p)
And then wrote a quick version which (without extensive testing) appears to be correct:
def randn_skew_fast(N, alpha=0.0, loc=0.0, scale=1.0):
    sigma = alpha / np.sqrt(1.0 + alpha**2)
    u0 = np.random.randn(N)
    v = np.random.randn(N)
    u1 = (sigma*u0 + np.sqrt(1.0 - sigma**2)*v) * scale
    u1[u0 < 0] *= -1
    u1 = u1 + loc
    return u1
# let's check again
plt.subplots(figsize=(12, 4))
for alpha_skew in SKEW_PARAMS:
    p = randn_skew_fast(NUM_SAMPLES, alpha_skew)
    sns.distplot(p)
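As an extra numerical check, a quick sketch using scipy.stats.skew, which measures sample skewness (which, as noted in the comments above, differs from the alpha parameter itself):
from scipy import stats
for alpha_skew in SKEW_PARAMS:
    p = randn_skew_fast(NUM_SAMPLES, alpha_skew)
    print(alpha_skew, stats.skew(p))  # alpha = 0 should give sample skewness near 0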
from scipy.stats import skewnorm
a = 10
data = skewnorm.rvs(a, size=1000)
Here, a is the shape parameter that controls the skew; see the documentation:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.skewnorm.html
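To mirror the randn_skew(1000, skew=0.7) call from the question, a small sketch (treating skew as the shape parameter a; loc and scale default to 0 and 1):
from scipy.stats import skewnorm
x = skewnorm.rvs(a=0.7, loc=0.0, scale=1.0, size=1000)  # 1000 draws; a controls the skew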
Adapted from the rsnorm function in the fGarch R package:
import numpy

def random_snorm(n, mean=0, sd=1, xi=1.5):
    def random_snorm_aux(n, xi):
        weight = xi / (xi + 1/xi)
        z = numpy.random.uniform(-weight, 1-weight, n)
        xi_ = xi**numpy.sign(z)
        random = -numpy.absolute(numpy.random.normal(0, 1, n))/xi_ * numpy.sign(z)
        m1 = 2/numpy.sqrt(2 * numpy.pi)
        mu = m1 * (xi - 1/xi)
        sigma = numpy.sqrt((1 - m1**2) * (xi**2 + 1/xi**2) + 2 * m1**2 - 1)
        return (random - mu)/sigma
    return random_snorm_aux(n, xi) * sd + mean
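A quick usage sketch (xi = 1 recovers the standard normal, so the skew enters only through xi; mean and sd rescale the result):
samples = random_snorm(1000, mean=0, sd=1, xi=1.5)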

Categorical Mixture Model in Pymc3

I'm new to Pymc3 and I'm trying to create the Categorical Mixture Model shown in https://en.wikipedia.org/wiki/Mixture_model#Categorical_mixture_model . I'm having difficulty hooking up the 'x' variable. I think it's because I have to make the z variable Deterministic, but I'm getting an error message at the line where 'x' is assigned: "ValueError: We expected 3 inputs but got 2." It looks like the p function only accepts 2 inputs, so I'm stuck. I've tried a bunch of different things, but haven't been able to get this to work yet.
import numpy as np
from pymc3 import *
import theano.tensor as t
K = 3   # NUMBER OF TOPICS
V = 20  # NUMBER OF WORDS
N = 15  # NUMBER OF DOCUMENTS

# GENERATE RANDOM CATEGORICAL MIXTURES
data = np.ones([N, V])

@theano.compile.ops.as_op(itypes=[t.lscalar, t.dscalar, t.dscalar], otypes=[t.dvector])
def p(z=z, phi=phi):
    return [phi[z[i,j]] for i in range(D) for j in range(W)]

model = Model()
with model:
    alpha = np.ones(V)
    beta = np.ones(K)
    theta = [Dirichlet('theta_%i' % i, alpha, shape=V) for i in range(K)]
    phi = Dirichlet('phi', beta, shape=K)
    z = [Categorical('z_%i' % i, p=phi, shape=V) for i in range(N)]
    x = [Categorical('x_%i_%i' % (i,j), p=p(z[i][j], phi), observed=data[i,j]) for i in range(N) for j in range(V)]
    #x = [Categorical('x_%i_%i' % (i,j), p=theta[z[i][j]], observed=data[i,j]) for i in range(N) for j in range(V)]
    print "Created model. Now begin sampling"
    step = Slice()
    trace = sample(n, step)
trace.get_values('phi')
For starters, in your example above, z and phi have no values at the point where they appear as defaults in the function signature. We also don't have values for D and W.
As for the number of arguments, the function you define has 2, but the theano decorator above it declares 3. I'd suggest:
@theano.compile.ops.as_op(itypes=[t.lscalar, t.dvector], otypes=[t.dvector])
def p(z, phi):
    return [phi[z[i,j]] for i,j in zip(range(D), range(W))]