Realtime plot with pyqtgraph - python-2.7

I have a problem with real-time plotting in pyqtgraph. I want the plot to update after every 100 items collected from serial input, but the curve only shows up once, after the data gathering has finished. The debugging print "boo" appears on the console after every 100 items, yet updatePlot() only seems to take effect when the loop ends. This is my code:
class EKG(QtGui.QMainWindow, out.Ui_MainWindow):
    def __init__(self):
        super(self.__class__, self).__init__()
        self.setupUi(self)
        self.collectedData = []
        self.dataPlot.showGrid(x=True, y=True, alpha=0.6)
        self.plt = self.dataPlot.plot(pen='m', antialias=True)
        self.port = "COM9"
        self.actionZako_cz.triggered.connect(self.close_window)
        self.startBtn.clicked.connect(self.collectData)

    def getTime(self):
        return int(self.timeBox.toPlainText())

    def updatePlot(self):
        self.plt.setData(self.collectedData)

    def collectData(self, howLong):
        howLong = self.getTime()
        self.collectedData = []
        serialData = serial.Serial(self.port, 57600)
        t_end = time.time() + howLong
        while time.time() < t_end:
            try:
                self.collectedData.append(int(serialData.readline().strip()))
            except ValueError:
                pass
            if len(self.collectedData) % 100 == 0:
                print "boo"
                self.updatePlot()
        serialData.close()
I would be grateful for any advice; it's the first time I'm using pyqtgraph and I haven't got the hang of it yet...

I've come across the solution; in case anybody stumbles upon a similar problem:
def updatePlot(self):
    self.plt.setData(self.collectedData)
    QtGui.QApplication.processEvents()
Adding a call to processEvents() makes the plot update properly!
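For completeness: a pattern that avoids blocking the event loop in the first place is to drive the serial reads from a QTimer instead of a while loop. This is only a rough sketch reusing the attribute names from my class above; the timer period and the batch size per tick are arbitrary, and the blocking readline() can still stall the GUI if the device goes quiet:

# inside the EKG class; assumes "from pyqtgraph.Qt import QtCore" at the top
def collectData(self):
    self.collectedData = []
    self.serialData = serial.Serial(self.port, 57600)
    self.t_end = time.time() + self.getTime()
    self.timer = QtCore.QTimer(self)
    self.timer.timeout.connect(self.readChunk)
    self.timer.start(50)                    # 50 ms between reads, chosen arbitrarily

def readChunk(self):
    if time.time() >= self.t_end:
        self.timer.stop()
        self.serialData.close()
        return
    for _ in range(20):                     # read a small batch per tick
        try:
            self.collectedData.append(int(self.serialData.readline().strip()))
        except ValueError:
            pass
    self.updatePlot()                       # the Qt event loop keeps running, so the curve redraws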

Related

Loading lightgbm model and using predict with parallel for loop freezes (Python)

I need to use my model to make predictions in batches and in parallel in Python. If I load the model, create the data frames in a regular for loop and use the predict function, it works with no issues. If I create disjoint data frames in parallel using multiprocessing and then use the predict function, the for loop freezes indefinitely. Why does this behavior occur?
Here is a snippet of my code:
with open('models/model_test.pkl', 'rb') as fin:
    pkl_bst = pickle.load(fin)

def predict_generator(X):
    df = X
    print(df.head())
    df = (df.groupby(['user_id']).recommender_items.apply(flat_map)
            .reset_index().drop('level_1', axis=1))
    df.columns = ['user_id', 'product_id']
    print('Merge Data')
    user_lookup = pd.read_csv('data/user_lookup.csv')
    product_lookup = pd.read_csv('data/product_lookup.csv')
    product_map = dict(zip(product_lookup.product_id, product_lookup.name))
    print(user_lookup.head())
    df = pd.merge(df, user_lookup, on=['user_id'])
    df = pd.merge(df, product_lookup, on=['product_id'])
    df = df.sort_values(['user_id', 'product_id'])
    users = df.user_id.values
    items = df.product_id.values
    df.drop(['user_id', 'product_id'], axis=1, inplace=True)
    print('Prediction Step')
    prediction = pkl_bst.predict(df, num_iteration=pkl_bst.best_iteration)
    print('Prediction Complete')
    validation = pd.DataFrame(zip(users, items, prediction),
                              columns=['user', 'item', 'prediction'])
    validation['name'] = (validation.item
                          .apply(lambda x: get_mapping(x, product_map)))
    validation = pd.DataFrame(zip(validation.user,
                                  zip(validation.name,
                                      validation.prediction)),
                              columns=['user', 'prediction'])
    print(validation.head())

    def get_items(x):
        sorted_list = sorted(list(x), key=lambda i: i[1], reverse=True)[:20]
        sorted_list = random.sample(sorted_list, 10)
        return [k for k, _ in sorted_list]

    relevance = validation.groupby('user').prediction.apply(get_items)
    return relevance.reset_index()
This works but is very slow:
results = []
for d in df_list_sub:
    r = predict_generator(d)
    results.append(r)
This breaks:
from multiprocessing import Pool
import tqdm

pool = Pool(processes=8)
results = []
for x in tqdm.tqdm(pool.imap_unordered(predict_generator, df_list_sub),
                   total=len(df_list_sub)):
    results.append(x)
    pass
pool.close()
pool.join()
I would be very thankful if someone could help me.
Stumbled onto this myself as well. This happens because LightGBM only allows the predict function to be accessed from a single process. The developers added this restriction explicitly because it doesn't make sense to call predict from multiple processes: the prediction function already makes use of all available CPUs. Beyond that, allowing multiprocess predicting would probably result in worse performance. More information about this can be found in this GitHub issue.
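If the pool is still wanted for the heavy pandas work, one workaround consistent with this is to parallelize only the feature preparation and keep the actual predict call in the parent process. A rough sketch (prepare_features is a hypothetical split of the question's predict_generator, not an existing function):

from multiprocessing import Pool

def prepare_features(X):
    # the groupby/merge/sort part of predict_generator would go here;
    # this stub just passes the frame through unchanged
    return X

if __name__ == '__main__':
    pool = Pool(processes=8)
    prepared = pool.map(prepare_features, df_list_sub)   # parallel, no Booster involved
    pool.close()
    pool.join()

    results = []
    for df in prepared:
        # single process: LightGBM parallelizes the prediction internally
        results.append(pkl_bst.predict(df, num_iteration=pkl_bst.best_iteration))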

Increase recursion limit and stack size in python 2.7

I'm working with large trees and need to increase the recursion limit on Python 2.7.
Using sys.setrecursionlimit(10000) crashes my kernel, so I figured I needed to increase the stack size.
However, I don't know how large the stack size should be. I tried 100 MiB like this: threading.stack_size(104857600), but the kernel still dies. Giving it 1 GiB throws an error.
I haven't worked with the threading module yet, so am I using it wrong by just putting the above statement at the beginning of my script? I'm not doing any kind of parallel processing; everything is done in the same thread.
My computer has 128 GB of physical RAM, running Windows 10, iPython console in Spyder.
The error displayed is simply:
Kernel died, restarting
Nothing more.
EDIT:
Full code to reproduce the problem. Building the tree works fine, though it takes quite long; the kernel only dies during the recursive execution of treeToDict() when reading the whole tree into a dictionary. Maybe there is something wrong with the code of that function. The tree is a non-binary tree:
import pandas as pd
import threading
import sys
import random as rd
import itertools as it
import string

threading.stack_size(104857600)
sys.setrecursionlimit(10000)

class treenode:
    # class to build the tree
    def __init__(self, children, name='', weight=0, parent=None, depth=0):
        self.name = name
        self.weight = weight
        self.children = children
        self.parent = parent
        self.depth = depth
        self.parentname = parent.name if parent is not None else ''

def add_child(node, name):
    # add element to the tree
    # if it already exists at the given node increase weight
    # else add a new child
    for i in range(len(node.children)):
        if node.children[i].name == name:
            node.children[i].weight += 1
            newTree = node.children[i]
            break
    else:
        newTree = treenode([], name=name, weight=1, parent=node, depth=node.depth+1)
        node.children.append(newTree)
    return newTree

def treeToDict(t, data):
    # read the tree into a dictionary
    if t.children != []:
        for i in range(len(t.children)):
            data[str(t.depth)+'_'+t.name] = [t.name, t.children[i].name, t.depth, t.weight, t.parentname]
    else:
        data[str(t.depth)+'_'+t.name] = [t.name, '', t.depth, t.weight, t.parentname]
    for i in range(len(t.children)):
        treeToDict(t.children[i], data)

# Create random dataset that leads to very long tree branches:
# A is an index for each set of data B which becomes one branch
rd.seed(23)
testSet = [''.join(l) for l in it.combinations(string.ascii_uppercase[:20], 2)]
A = []
B = []
for i in range(10):
    for j in range(rd.randint(10, 6000)):
        A.append(i)
        B.append(rd.choice(testSet))
dd = {"A": A, "B": B}
data = pd.DataFrame(dd)

# The maximum length should be above 5500, use another seed if it's not:
print data.groupby('A').count().max()

# Create the tree
root = treenode([], name='0')
for i in range(len(data.values)):
    if i == 0:
        newTree = add_child(root, data.values[i, 1])
        oldses = data.values[i, 0]
    else:
        if data.values[i, 0] == oldses:
            newTree = add_child(newTree, data.values[i, 1])
        else:
            newTree = add_child(root, data.values[i, 1])
            oldses = data.values[i, 0]

result = {}
treeToDict(root, result)
PS: I'm aware the treeToDict() function is faulty in that it will overwrite entries because there can be duplicate keys. That bug is irrelevant to this error, however.
In my experience, the problem is not the stack size but the algorithm itself.
It's possible to implement the tree traversal without recursion at all: use a stack-based depth-first or breadth-first search instead.
Python-like pseudo-code might look like this:
def traverse_tree(root):
    stack = [root]
    while stack:
        cur = stack.pop()
        cur.do_some_awesome_stuff()
        # push the children individually (extend, not append)
        stack.extend(cur.get_children())
This approach scales well and lets you deal with trees of any depth.
As further reading you can try this and that.
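Applied to treeToDict() from the question, an iterative version could look roughly like this (it keeps the original key scheme, duplicate-key overwrites included):

def treeToDict_iterative(root, data):
    # explicit stack instead of recursion, so no recursion limit is hit
    stack = [root]
    while stack:
        t = stack.pop()
        child_name = t.children[-1].name if t.children else ''
        data[str(t.depth) + '_' + t.name] = [t.name, child_name, t.depth, t.weight, t.parentname]
        stack.extend(t.children)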

Meaning of setUseAdjustedValues(True) in pyalgotrade

Here is an example of an SMA cross strategy. Why do we use self.setUseAdjustedValues(True), and how does it work?
from pyalgotrade import strategy
from pyalgotrade.technical import ma
from pyalgotrade.technical import cross

class SMACrossOver(strategy.BacktestingStrategy):
    def __init__(self, feed, instrument, smaPeriod):
        strategy.BacktestingStrategy.__init__(self, feed)
        self.__instrument = instrument
        self.__position = None
        # We'll use adjusted close values instead of regular close values.
        self.setUseAdjustedValues(True)
        self.__prices = feed[instrument].getPriceDataSeries()
        self.__sma = ma.SMA(self.__prices, smaPeriod)

    def getSMA(self):
        return self.__sma

    def onEnterCanceled(self, position):
        self.__position = None

    def onExitOk(self, position):
        self.__position = None

    def onExitCanceled(self, position):
        # If the exit was canceled, re-submit it.
        self.__position.exitMarket()

    def onBars(self, bars):
        # If a position was not opened, check if we should enter a long position.
        if self.__position is None:
            if cross.cross_above(self.__prices, self.__sma) > 0:
                shares = int(self.getBroker().getCash() * 0.9 / bars[self.__instrument].getPrice())
                # Enter a buy market order. The order is good till canceled.
                self.__position = self.enterLong(self.__instrument, shares, True)
        # Check if we have to exit the position.
        elif not self.__position.exitActive() and cross.cross_below(self.__prices, self.__sma) > 0:
            self.__position.exitMarket()
If you use regular close values, instead of adjusted ones, your strategy may react to price changes that are actually the result of a stock split and not a price change due to regular trading activity.
As I understand it, to simplify: suppose a share's price is 100. The next day the share splits 1:2, which means two shares of 50 each. This price change is not the result of trading activity; no trade was involved in lowering the price. setUseAdjustedValues(True) handles this situation.
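A tiny numeric illustration of the same idea (the numbers are made up): with raw closes the series shows a 50% drop at the split, while a back-adjusted series does not:

import pandas as pd

close = pd.Series([100.0, 50.0, 51.0])    # raw closes; the drop is a 1:2 split, not selling
factor = pd.Series([1.0, 2.0, 2.0])       # cumulative split factor on each day
adj_close = close * factor / factor.iloc[-1]
print(adj_close.tolist())                 # [50.0, 50.0, 51.0] -> no artificial 50% crash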

Animation figure closing when interval is increased (FuncAnimation) / Explanation of interval in FuncAnimation

I am trying to write a program for 1D FDTD wave propagation, and everything is fine except the interval keyword argument of FuncAnimation. Whenever I increase the interval beyond 10 (19 to be precise), the animation figure closes before running (it exits as soon as it pops up). I can easily slow down the animation using time.sleep, but it would be great if I could understand this. Can somebody please explain how this interval argument works? Is it in any way related to the time required by the frame-update function called by FuncAnimation? Also, what is blit for?
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.animation as animation

def main():
    # defining dimensions
    xdim = 720
    time_tot = 500
    xsource = xdim/2
    # stability factor
    S = 1
    # Speed of light
    c = 1
    epsilon0 = 1
    mu0 = 1
    delta = 1  # Space step
    deltat = S*delta/c  # Time step
    Ez = np.zeros(xdim)  # Arrays to store Electric field and magnetic field
    Hy = np.zeros(xdim)
    epsilon = epsilon0*np.ones(xdim)  # Permittivity and permeability values.
    mu = mu0*np.ones(xdim)

    fig, axis = plt.subplots(1, 1)
    axis.set_xlim(len(Ez))
    axis.set_ylim(-3, 3)
    axis.set_title("E Field")
    line, = axis.plot([], [])

    def init():
        line.set_data([], [])
        return line,

    def animate(n, *args, **kwargs):
        Hy[0:xdim-1] = Hy[0:xdim-1]+(delta/(delta*mu[0:xdim-1]))*(Ez[1:xdim]-Ez[0:xdim-1])
        Ez[1:xdim] = Ez[1:xdim]+(delta/(delta*epsilon[1:xdim]))*(Hy[1:xdim]-Hy[0:xdim-1])
        #Ez[xsource] = Ez[xsource] + 30.0*(1/np.sqrt(2*np.pi))*np.exp(-(n-80.0)**2/(100))
        Ez[xsource] = np.sin(2*n*np.pi/180)
        ylims = axis.get_ylim()
        if abs(np.amax(Ez)) > ylims[1]:  # Scaling axis
            axis.set_ylim(-(np.amax(Ez)+2), np.amax(Ez)+2)
        line.set_data(np.arange(len(Ez)), Ez)
        return line,

    ani = animation.FuncAnimation(fig, animate, init_func=init, frames=time_tot,
                                  interval=10, blit=False, repeat=False)
    fig.show()

if __name__ == "__main__":
    main()

Conditional quit multiprocess in python

I'm trying to build a python script that runs several processes in parallel. Basically, the processes are independent, work on different folders and leave their output as text files in those folders. But in some special cases, a process might terminate with a special (boolean) status. If so, I want all the other processes to terminate right away. What is the best way to do this?
I've fiddled with multiprocessing.Condition() and multiprocessing.Manager(), after reading the excellent tutorial by Doug Hellmann:
http://pymotw.com/2/multiprocessing/communication.html
However, I do not seem to understand how to get a multiprocessing process to monitor a status indicator and quit if it takes a special value.
To exemplify this, I've written the small script below. It somewhat does what I want, but it ends in an exception. Suggestions on more elegant ways to proceed are gratefully welcomed:
br,
Gro
import multiprocessing

def input(i):
    """arbitrary chosen value (8) gives special status = 1"""
    if i == 8:
        value = 1
    else:
        value = 0
    return value

def sum_list(list):
    """accumulative sum of list"""
    sum = 0
    for x in range(len(list)):
        sum = sum + list[x]
    return sum

def worker(list, i):
    value = input(i)
    list[i] = value
    print 'processname', i

if __name__ == '__main__':
    mgr = multiprocessing.Manager()
    list = mgr.list([0]*20)
    jobs = [ multiprocessing.Process(target=worker, args=(list, i))
             for i in range(20) ]
    for j in jobs:
        j.start()
        sumlist = sum_list(list)
        print sumlist
        if sumlist == 1:
            break
    for j in jobs:
        j.join()
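A common pattern for this kind of early termination is to share a multiprocessing.Event: whichever worker hits the special status sets it, and every worker checks it between chunks of work. A minimal sketch, with purely illustrative worker logic:

import multiprocessing

def worker(stop_event, i):
    if i == 8:                     # arbitrary "special" condition, as in the question
        stop_event.set()           # tell everybody else to stop
        return
    while not stop_event.is_set():
        # ... do one chunk of real work here, then re-check the flag ...
        break                      # placeholder so this sketch terminates

if __name__ == '__main__':
    stop_event = multiprocessing.Event()
    jobs = [multiprocessing.Process(target=worker, args=(stop_event, i)) for i in range(20)]
    for j in jobs:
        j.start()
    for j in jobs:
        j.join()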