Importing scipy stats modules to get expected value of empirical distribution - python-2.7

I want to fit every member of a list of data sets to a lognormal distribution. Then, I want to calculate the expected value of a function over each distribution. I've tried the following code and get the following error.
Code
from numpy import *
from scipy.stats import lognorm
dists = map(lognorm,data)
expectations = [dist.expect(r_[1,1],zeros(40,)) for dist in dists]
Error
AttributeError: 'rv_frozen' object has no attribute 'expect'
Perhaps I'm reading the documentation incorrectly; I thought that because expect is a method of lognorm, it would be available to frozen distributions.
What is the right way to call the methods such as 'expect' from a frozen distribution?

See the thread at
http://mail.scipy.org/pipermail/scipy-user/2012-August/032860.html
expect is not yet connected to frozen distributions. Either use a distribution that is not frozen, or use a helper function like
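def expect(X, f, lb, ub):
    # A frozen distribution keeps the underlying distribution in .dist
    # and its parameters in .args / .kwds.
    if hasattr(X, 'dist'):
        # Assumes loc and scale, if any, were passed as keywords when freezing.
        return X.dist.expect(f, X.args, lb=lb, ub=ub, **X.kwds)
    else:
        return X.expect(f, lb=lb, ub=ub)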
Update:
Besides the problem with the frozen distribution, you need to check the methods of the distributions.
You need to use .fit(data, ...) to estimate the parameters.
You can calculate an expected value of a function using expect, the signature is here http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.rv_continuous.expect.html?highlight=expect#scipy.stats.rv_continuous.expect
The default function for expect is the identity mapping, which calculates the mean. But you can also get the mean directly from the distribution using either the .mean or the .stats method. This avoids the integration if there is an explicit expression for the mean.
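As a rough sketch of the steps above (fit, then expect, then the closed-form mean): the question's data is not shown, so the synthetic `data`, the function `f`, and the choice `floc=0` below are all illustrative assumptions rather than part of the original post.

```python
import numpy as np
from scipy.stats import lognorm

# Hypothetical stand-in for the question's `data`: three positive-valued samples.
rng = np.random.default_rng(0)
data = [rng.lognormal(mean=0.0, sigma=0.5, size=500) for _ in range(3)]


def f(x):
    # Illustrative function whose expected value we want.
    return x ** 2


expectations = []
means = []
for sample in data:
    # Estimate shape, location and scale; floc=0 pins the location (an assumption).
    s, loc, scale = lognorm.fit(sample, floc=0)
    # Call expect on the unfrozen distribution, passing the fitted parameters.
    expectations.append(lognorm.expect(f, args=(s,), loc=loc, scale=scale))
    # The mean has a closed form, so no numerical integration is needed for it.
    means.append(lognorm.mean(s, loc=loc, scale=scale))
```

Note that the shape parameter goes in `args`, while `loc` and `scale` are keywords of `expect`.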

If you look at the SciPy frozen object, you will see that expect is not one of its methods.
Try :
from numpy import *
from scipy.stats import lognorm
dists = map(lognorm,data)
expectations = [lognorm.expect(func, args=(s,), loc=loc) for dist in dists]
(I am not sure of all the function's options; per the expect signature, the shape parameter s goes in args and loc is a keyword.)

Related

module 'pymc3' has no attribute 'traceplot' Error

I'm trying to generate a trace plot of my model but it shows module 'pymc3' has no attribute 'traceplot' error. My code is:
with pm.Model() as our_first_model:
    # a priori
    theta = pm.Beta('theta', alpha=1, beta=1)
    # likelihood
    y = pm.Bernoulli('y', p=theta, observed=data)
    #y = pm.Binomial('theta',n=n_experimentos, p=theta, observed=sum(datos))
    start = pm.find_MAP()
    step = pm.Metropolis()
    trace = pm.sample(1000, step=step, start=start)
burnin = 0 # no burnin
chain = trace[burnin:]
pm.traceplot(chain, lines={'theta':theta_real});
which then gives the following error:
AttributeError Traceback (most recent call last)
<ipython-input-8-40f97a342e0f> in <module>
1 burnin = 0 # no burnin
2 chain = trace[burnin:]
----> 3 pm.traceplot(chain, lines={'theta':theta_real});
AttributeError: module 'pymc3' has no attribute 'traceplot'
I'm on Windows 10, and I installed pymc3 with pip since it was not included in the Anaconda distribution I downloaded.
Since several versions ago, PyMC3 delegates plotting and stats to ArviZ, and the original plotting commands were kept as aliases to ArviZ methods for convenience and ease of transition.
The latest PyMC3 release (3.11.0) is the first that no longer includes aliases such as pm.traceplot. You have to use arviz.plot_trace instead, which works with PyMC3 objects.
Extra notes unrelated to the question itself:
You are using pm.find_MAP to initialize the chain and you are manually setting the sampler to pm.Metropolis instead of allowing pm.sample to select its own defaults. There are reasons to do so and it's not intrinsically wrong, but it is discouraged; see the PyMC3 FAQs.
PyMC3 is transitioning to using InferenceData as default output of pm.sample. I would recommend setting return_inferencedata=True in pm.sample for the following reasons: 1) ArviZ functions convert to this format under the hood, you will avoid this small overhead, 2) InferenceData has more capabilities than MultiTrace, 3) PyMC3 is transitioning to InferenceData as the default output of pm.sample so why not get started already?
You have a # no burn-in comment; however, the trace returned by pm.sample has already had a burn-in applied, whose length is the tune parameter passed to it. The default value of tune is 1000. To actually get all the samples and see how the MCMC slowly converges to the typical set, you need to use discard_tuned_samples=False.
Some InferenceData resources:
InferenceData overview: https://arviz-devs.github.io/arviz/getting_started/XarrayforArviZ.html
Working with InferenceData examples (shows how to perform burn-in among other things): https://arviz-devs.github.io/arviz/getting_started/WorkingWithInferenceData.html

Matplotlib Qt4 GUI programming - replace plt.figure() with OO equivalent

I have an App made using Qt4 Designer which inserts a matplotlib figure into a container widget.
The code to generate the figure comes from another module, obspy:
self.st.plot(fig = self.rawDataPlot)
https://docs.obspy.org/packages/autogen/obspy.core.stream.Stream.plot.html
Normally, this would create and show a matplotlib figure for the st object's data, which is time-series. When the fig parameter is specified this tells self.st.plot to plot to an existing matplotlib figure instance.
The code I have to generate the figure and then position it in my GUI widget is:
def addmpl(self, fig, layout, window): # code to add mpl figure to Qt4 widget
    self.canvas = FigureCanvas(fig)
    layout.addWidget(self.canvas)
    self.canvas.draw()
    self.toolbar = NavigationToolbar(self.canvas,
                                     window, coordinates=True)
    layout.addWidget(self.toolbar)

self.rawDataPlot = plt.figure() # code to create a mpl figure instance
self.st.plot(fig = self.rawDataPlot) # plot time-series data to existing matplotlib figure instance
self.addmpl(self.rawDataPlot, self.mplvl, self.mplwindow) # add mpl figure to Qt4 widget
What I want to do is instantiate a matplotlib figure (for use by self.st.plot) in a way that avoids plt.figure(), as I have read that this is bad practice when using object-oriented programming.
If I replace plt.figure() with Figure() (from matplotlib.figure.Figure()) I get an error:
AttributeError: 'NoneType' object has no attribute 'draw'
As it stands, the App runs fine if I use plt.figure(), but is there a clean way to avoid using it, and is it even necessary in my case?
PS, the code snippets here are taken from a larger source, but I think it gets the point across..
In principle both methods should work.
Whether you set self.rawDataPlot = plt.figure() or self.rawDataPlot = Figure() does not make a huge difference, assuming the imports are correct.
So the error is most probably triggered within the self.st.plot() function. (In general, if you report errors, append the traceback.)
Looking at the source of obspy.core.stream.Stream.plot there is a keyword argument
:param draw: If True, the figure canvas is explicitly re-drawn, which
ensures that existing figures are fresh. It makes no difference
for figures that are not yet visible.
Defaults to True.
That means the plot function apparently tries to draw the canvas, which has not yet been set when a bare Figure() is provided.
A good guess would therefore be to call
self.rawDataPlot = Figure()
self.st.plot(fig = self.rawDataPlot, draw=False)
and see if the problem persists.
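Another option, sketched here as an assumption rather than something taken from obspy, is to attach a canvas to the Figure before handing it to any code that calls fig.canvas.draw(). The example uses the non-interactive FigureCanvasAgg so that it runs without Qt; in the App above the Qt canvas class (FigureCanvas in the question) would take its place, created before the plot call instead of inside addmpl.

```python
from matplotlib.figure import Figure
from matplotlib.backends.backend_agg import FigureCanvasAgg

# Create the figure without pyplot, then attach a canvas to it.
fig = Figure()
canvas = FigureCanvasAgg(fig)  # constructing the canvas sets fig.canvas

# Any plotting code that later calls fig.canvas.draw() now has a canvas to use.
ax = fig.add_subplot(1, 1, 1)
ax.plot([0, 1, 2], [0, 1, 0])
fig.canvas.draw()
```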

Equivalent of tf.identity with control dependency for an operation node

I am writing a wrapper class that takes a generic graph with a special member "train_op" to manage the training, saving, and housekeeping of my model.
I wanted to cleanly keep track of the lifetime number of training steps like so:
with tf.control_dependencies([ step_add_one ]):
    self.train_op=tf.identity(self.training_graph.train_op )
raise TypeError('Expected binary or unicode string, got %r'
e, is_training=True, inputs=None)
I think the rub here is that train_op is the return of tf.Optimizer.minimize(), so it is not a tensor per se, but an operation.
An obvious workaround would be to call tf.identity on the training_graph.loss, but I lose a bit of abstraction because I have to then handle the learning rate etc externally. Moreover, I feel like I'm missing something.
How can I best remedy this?
You can use tf.group(), which will work with operations and tensors.
For instance:
x = tf.Variable(1.)
loss = tf.square(x)
optimizer = tf.train.GradientDescentOptimizer(0.1)
train_op = optimizer.minimize(loss)
step = tf.Variable(0)
step_add_one = step.assign_add(1)
with tf.control_dependencies([step_add_one]):
    train_op_2 = tf.group(train_op)
Now when you run train_op_2, the value of step will be incremented.
However, the best way to go (if you can modify the code that created the graph) is to pass a global_step parameter to the minimize function:
train_op = optimizer.minimize(loss, global_step=step)

Overwriting Existing Python Plots with New Function Call

I would like to overwrite an existing plot I made in python with a new function call. I would like to produce a plot, look at it, then call the same function again with different arguments to produce another plot. I would like the second plot to replace the first plot. That is, I don't want two figure windows open; just the original window overwritten.
I have tried using interactive mode when plotting (ion()), placing plot() and show() calls in different places, and clearing figures. The problems I have are:
1. I can never overwrite an existing window; I always open more.
2. show() blocks the code from continuing, so I am unable to perform the second function call.
3. When I use interactive mode, the window appears for a second before going away.
What I'm trying to do seems simple enough, but I'm having great difficulty. Any help is appreciated.
Easiest solution
There are many ways to do this, the easiest of which is to reset the plot's Line2D using its set_ydata(...) method and pyplot.pause. There are versions of matplotlib (<0.9, I believe) which don't have pyplot.pause, so you may need to update yours. Here's a simple minimal working example:
import numpy as np
from matplotlib import pyplot as plt
ph, = plt.plot(np.random.rand(100))
def change_plot():
    ph.set_ydata(np.random.rand(100))
    plt.pause(1)
while True:
    change_plot()
Other approaches
Using pyplot.ion and pyplot.ioff, as detailed here. I tend to use these when I'm doing exploratory data analysis with a Python shell.
Using the matplotlib.animation package, as detailed in this very comprehensible example. This is a much more robust approach than the easy solution above, and permits all kinds of useful/fun options, such as outputting the animation to a video file, etc.
Instead of using the set_ydata method of the Lines object, you can always clear the axes (pyplot.cla()) and call the plotting command again. For example, if you are using pyplot.contour, the returned QuadContourSet has no set_zdata method, but this will work:
import numpy as np
from matplotlib import pyplot as plt
X,Y = np.meshgrid(np.arange(100),np.arange(100))
def change_plot():
    Z = np.random.random(X.shape)
    plt.cla()
    ph = plt.contour(X,Y,Z)
    plt.pause(1)
while True:
    change_plot()
Write your plotting function like
def my_plotter(ax, data, style):
    ax.cla()
    # ax.whatever for the rest of your plotting
    return artists_added
and then call it like
data = some_function()
arts = my_plotter(plt.gca(), data, ...)
or do
fig, ax = plt.subplots()
and then call your plotting function like
arts = my_plotter(ax, data, ...)
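A runnable sketch of this pattern might look as follows. The body of my_plotter (a single ax.plot call with a format string) and the sample data are assumptions for illustration; the Agg backend is selected only so the sketch runs without a display and should be dropped for an interactive window.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch only
import matplotlib.pyplot as plt
import numpy as np


def my_plotter(ax, data, style):
    # Clear whatever was drawn before, then redraw on the same axes.
    ax.cla()
    artists_added = ax.plot(data, style)
    # Ask the existing figure to refresh instead of opening a new window.
    ax.figure.canvas.draw_idle()
    return artists_added


fig, ax = plt.subplots()
t = np.linspace(0, 6, 50)
arts = my_plotter(ax, np.sin(t), "b-")   # first plot
arts = my_plotter(ax, np.cos(t), "r--")  # replaces the first plot in place
```

Because the same ax is reused, the second call overwrites the first plot and no additional figure is created.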
I had almost the same issue and I solved it by assigning a name for each plot.
def acc(train_acc, test_acc, savename):
    plt.figure(savename) # If you remove this line, the plots will be added to the same plot. But, when you assign a figure, each plot will appear in a separate plot.
    ep = np.arange(len(train_acc)) + 1
    plt.plot(ep, train_acc, color="blue", linewidth=1, linestyle="-", label="Train")
    plt.plot(ep, test_acc, color="red", linewidth=1, linestyle="-", label="Test")
    plt.title("NDCG")
    plt.xlabel("Iteration")
    plt.ylabel("NDCG#K")
    plt.legend(loc='lower right')
    plt.savefig(savename)

dynamic graph using matplotlib

import matplotlib.pyplot as plt
import matplotlib.animation as animation
import time
fig=plt.figure()
ax1=fig.add_subplot(1,1,1)
def animate():
    data = np.loadtxt("new.txt")
    ax1.plot(data[:,0], data[:,1])
    return
ani=animation.FuncAnimation(fig,animate,frames=1000)
plt.show()
The error that pops up is:
TypeError: animate() takes no argument(1 given)
What should I do?
There seem to be two errors in your code:
The file you are trying to read really does not exist or is not readable by your script. (That is why you get the error message.)
As Hima correctly says, the FuncAnimation expects to receive a function (i.e. animate) whereas animate() is the return value of the function.
Then there is one thing you should consider. If your data is in a simple file, you might try using the numpy.loadtxt to read that file. In that case your function animate would be something like:
import numpy as np
def animate(i):  # FuncAnimation passes the frame number as the first argument
    data = np.loadtxt("myfile.txt")
    ax1.plot(data[:,0], data[:,1])
However, even after that you will end up with an increasing number of plots, because every plot command creates a new line. Instead you might want to have a look at the set_xdata method. (As I guess this is a school assignment, I won't give the full solution.)
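For reference, a minimal sketch of the idea of updating one line instead of adding new ones, using set_data (which sets x and y together). The temporary file, its contents, and the interval value are assumptions made so that the example is self-contained; the Agg backend is used only so it runs without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch only
import matplotlib.animation as animation
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical two-column data file standing in for "new.txt".
path = os.path.join(tempfile.mkdtemp(), "new.txt")
np.savetxt(path, np.column_stack([np.arange(10), np.arange(10) ** 2]))

fig, ax1 = plt.subplots()
line, = ax1.plot([], [])  # one line object, created once


def animate(i):
    # Re-read the file and update the existing line rather than plotting again.
    data = np.loadtxt(path)
    line.set_data(data[:, 0], data[:, 1])
    ax1.relim()           # recompute data limits for the new data
    ax1.autoscale_view()  # rescale the axes to fit
    return line,


# Pass the function itself (animate), not the result of calling it.
ani = animation.FuncAnimation(fig, animate, frames=1000, interval=200)
animate(0)  # one manual update; plt.show() would drive it interactively
```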