How to specify GP dependent on non-obsereved random walk - pymc3

I have a cyclical signal I would like to model. I would like to allow the signal to be able to stretch and compress in time, and I do not know the exact profile.
At the moment, I am modelling the phase progression as a random walk, and capturing the cyclical nature by defining the mean likelihood as a sum of sines and cosines on the phase, where the weights on the cosines are parameters to be fitted.
i.e.
y = N(f(phase),sigma) = N(sum_i(a_i*sin(phase) + b_i*cos(phase)),sigma)
(i.e. latex image of above)
This seems to work to some extent, but I would like to change the definition of f so that it does not rely on sums of sin and cos.
I was looking at Gaussian Processes, and thinking that there could be a solution to this there - but I can't figure out how (if it's possible) to define the y in terms of phase when using GP.
There is an example on the pymc github site:
y_obs = pm.gp.GP('y_obs', cov_func=f_cov, sigma=s2_n, observed={'X':X, 'Y':y})
The problem here is that X is defined as observed, while I need to model it as a random variable.
I tried this form:
y_obs = pm.gp.GP('y_obs', X = phase , cov_func=f_cov, sigma=s2_n, observed={ 'Y':y})
But that leads to an error:
File "/home/person/.conda/envs/mcmcx/lib/python3.6/site-packages/pymc3/distributions/distribution.py", line 56, in __init__
raise TypeError("Expected int elements in shape")
I am new to HB/GP/pymc3... and even stackoverflow. Apologies if the question is off.

Related

userWarning pymc3 : What does reparameterize mean?

I built a pymc3 model using the DensityDist distribution. I have four parameters out of which 3 use Metropolis and one uses NUTS (this is automatically chosen by the pymc3). However, I get two different UserWarnings
1.Chain 0 contains number of diverging samples after tuning. If increasing target_accept does not help try to reparameterize.
MAy I know what does reparameterize here mean?
2. The acceptance probability in chain 0 does not match the target. It is , but should be close to 0.8. Try to increase the number of tuning steps.
Digging through a few examples I used 'random_seed', 'discard_tuned_samples', 'step = pm.NUTS(target_accept=0.95)' and so on and got rid of these user warnings. But I couldn't find details of how these parameter values are being decided. I am sure this might have been discussed in various context but I am unable to find solid documentation for this. I was doing a trial and error method as below.
with patten_study:
#SEED = 61290425 #51290425
step = pm.NUTS(target_accept=0.95)
trace = sample(step = step)#4000,tune = 10000,step =step,discard_tuned_samples=False)#,random_seed=SEED)
I need to run these on different datasets. Hence I am struggling to fix these parameter values for each dataset I am using. Is there any way where I give these values or find the outcome (if there are any user warnings and then try other values) and run it in a loop?
Pardon me if I am asking something stupid!
In this context, re-parametrization basically is finding a different but equivalent model that it is easier to compute. There are many things you can do depending on the details of your model:
Instead of using a Uniform distribution you can use a Normal distribution with a large variance.
Changing from a centered-hierarchical model to a
non-centered
one.
Replacing a Gaussian with a Student-T
Model a discrete variable as a continuous
Marginalize variables like in this example
whether these changes make sense or not is something that you should decide, based on your knowledge of the model and problem.

Random variable created with scipy.stats and multiprocessing : Pickle error

I'm no king in python, and recently got in trouble with a modification I made in my code. My algorithm is basically multiple uses of stochastic gradient algorithm and thus needs random variables.
I wanted my code to handle custom random variables and probability distribution. To do so, I modified my code and now I use scipy.stats to draw samples of custom random variables. Basically, I create a random variable with an imposed probability density or a cumulative density, and then draw samples thanks to the inverse function of the cumulative distribution function and some uniform random variable between [0,1].
To make it simple the algorithm runs multiple optimization from different starting point using stochastic gradient algorithm, and thus can be parallelized since the starting points are independent.
Problem is that the random variable created this way can't be pickled
PicklingError: Can't pickle : attribute lookup builtin.instancemethod failed
I don't get the subtility of pickling problems for now, so if you guys can help me solve this following simple illustration of the problem :
RV = scipy.stats.norm();
def Draw(rv,N):
return rv.ppf(np.random.random(N))
pDraw = partial(Draw,RV);
PM = multiprocessing.pool(Processes = 2);
L = PM.map(pDraw,range(1,5));
I've heard of pathos library that do not use the same serialization algorithm (dill), but I would like to avoid this solution (if it is a solution) as it is not included in my python distribution at work... making it install will take a lot of time.

Same do-file, same computer, sometimes different results

I've got a large do-file that calls several sub-do-files, all in the lead-up to the estimation of a custom maximum likelihood model. That is, I have a main.do, which looks like this
version 12
set seed 42
do prepare_data
* some other stuff
do estimate_ml
and estimate_ml.do looks like this
* lots of other stuff
global cdf "normal"
program define customML
args lnf r noise
tempvar prob1l prob2l prob1r prob2r y1l y2l y1r y2r euL euR euDiff scale
quietly {
generate double `prob1l' = $ML_y2
generate double `prob2l' = $ML_y3
generate double `prob1r' = $ML_y4
generate double `prob2r' = $ML_y5
generate double `scale' = 1/100
generate double `y1l' = `scale'*((($ML_y10+$ML_y6)^(1-`r'))/(1-`r'))
generate double `y2l' = `scale'*((($ML_y10+$ML_y7)^(1-`r'))/(1-`r'))
generate double `y1r' = `scale'*((($ML_y10+$ML_y8)^(1-`r'))/(1-`r'))
generate double `y2r' = `scale'*((($ML_y10+$ML_y9)^(1-`r'))/(1-`r'))
generate double `euL' = (`prob1l'*`y1l')+(`prob2l'*`y2l')
generate double `euR' = (`prob1r'*`y1r')+(`prob2r'*`y2r')
generate double `euDiff' = (`euR'-`euL')/`noise'
replace `lnf' = ln($cdf( `euDiff')) if $ML_y1==1
replace `lnf' = ln($cdf(-`euDiff')) if $ML_y1==0
}
end
ml model lf customML ... , maximize technique(nr) difficult cluster(id)
ml display
To my great surprise, when I run the whole thing from top to bottom in Stata 12/SE I get different results for one of the coefficients reported by ml display each time I run it.
At first I thought this was a problem of running the same code on different computers but the issue occurs even if I run the same code on the same machine multiple times. Then I thought this was a random number generator issue but, as you can see, I can reproduce the issue even if I fix the seed at the beginning of the main do-file. The same holds when I move the set seed command immediately above the ml model.... The only way to get the same results though multiple runs is if I run everything above ml model and then only run ml model and ml display repeatedly.
I know that the likelihood function is very flat in the direction of the parameter whose value changes over runs so it's no surprise it can change. But I don't understand why it would, given that there seems to be little that isn't deterministic in my do files to begin with and nothing that couldn't be made deterministic by fixing the seed.
I suspect a problem with sorting. The default behaviour is that if two observations have the same value, they will be sorted randomly. Moreover, the random process that guides this sorting is governed by a different seed. This is intentional, as it prevents users to by accident see consistency where none exist. The logic being that it is better to be puzzled than to be overly confident.
As someone mentioned in the comments to this answer, adding the option stable to my sort command made the difference in my situation.

Stata seems to be ignoring my starting values in maximum likelihood estimation

I am trying to estimate a maximum likelihood model and it is running into convergence problems in Stata. The actual model is quite complicated, but it converges with no troubles in R when it is supplied with appropriate starting values. I however cannot seem to get Stata to accept the starting values I provide.
I have included a simple example below estimating the mean of a poisson distribution. This is not the actual model I am trying to estimate, but it demonstrates my problem. I set the trace variable, which allows you to see the parameters as Stata searches the likelihood surface.
Although I use init to set a starting value of 0.5, the first iteration still shows that Stata is trying a coefficient of 4.
Why is this? How can I force the estimation procedure to use my starting values?
Thanks!
generate y = rpoisson(4)
capture program drop mypoisson
program define mypoisson
args lnf mu
quietly replace `lnf' = $ML_y1*ln(`mu') - `mu' - lnfactorial($ML_y1)
end
ml model lf mypoisson (mean:y=)
ml init 0.5, copy
ml maximize, iterations(2) trace
Output:
Iteration 0:
Parameter vector:
mean:
_cons
r1 4
Added: Stata doesn't ignore the initial value. If you look at the output of the ml maximize command, the first line in the listing will be titled
initial: log likelihood =
Following the equal sign is the value of the likelihood for the parameter value set in the init statement.
I don't know how the search(off) or search(norescale) solutions affect the subsequent likelihood calculations, so these solution might still be worthwhile.
Original "solutions":
To force a start at your initial value, add the search(off) option to ml maximize:
ml maximize, iterate(2) trace search(off)
You can also force a use of the initial value with search(norescale). See Jeff Pitblado's post at http://www.stata.com/statalist/archive/2006-07/msg00499.html.

Using predict for ancillary parameters in maximum likelihood model in Stata

I wanted to know whether I can use the predict option for ancillary parameters (maximum likelihood ) program as follows (I estimated lnsigma and so sigma is the ancillary parameter in the model):
predict lnsigma, eq(lnsigma)
gen sigma=exp(lnsigma)
I also would like to know whether we can use above for heteroscedastic model.
Thank you in advance.
That sounds correct. I would be more explicit by typing predict lnsigma, xb eq(lnsigma). This way your code will not be broken when someone later on desides to write a prediction program for your estimation program and sets the default to something different than the linear prediction.
You can also do it in one line:
predictnl sigma = exp(xb(#2))
This assumes that lnsigma is the second equation in your model. If it is the third equation you replace xb(#2) with xb(#3). predictnl is also also an easy way of using the delta method to predict standard errors and confidence intervals for sigma.
I assume this is your own Stata program. If that is true, then you also have a third option: You can create your own prediction program, which Stata's predict command will recongnize. You can find some useful tricks on how to do that here: http://www.stata.com/help.cgi?_pred_se