I have a small project that uses matplotlib to display a wafer map of die. I am "compiling" the single-file Python (2.7) into an executable using PyInstaller with the --onefile option, so that non-Python users at the company can execute it in Windows.
The executable takes quite a while to load, up to 15s. As a workaround, I removed all the wafer-map plotting capabilities of the program and built a "Lite" version. This Lite version runs in <1s, as it should. In addition, the Lite version's .exe is 85% smaller (as expected).
So it looks like the Matplotlib stuff is bloating the exe and is making it take a long time to load.
Here's my thought process:
I should be able to get the file size down and decrease the load time if I only import the modules I use rather than all of matplotlib.pyplot. I assume that the import matplotlib.pyplot as pyplot line is importing a whole bunch of extra stuff that I'm not using, such as scatterplots.
Here's my question:
How can I only import the parts of matplotlib that I use?
Here's my (relevant) code, with a lot of the fluff (like line colors) removed. Also, please ignore the lack of PEP8 conformity - this was written before I decided to follow it :-)
from __future__ import print_function
import math
import matplotlib.pyplot as pyplot
import matplotlib.patches
fig = pyplot.figure(1)
ax = fig.add_subplot(111, aspect='equal')
ax.axis([xAxisMin, xAxisMax, yAxisMin, yAxisMax])
die = matplotlib.patches.Rectangle(coords, dieX, dieY)
ax.add_patch(die)
arc = matplotlib.patches.Arc((0, 0),
                             width=exclDia, height=exclDia, angle=-90,
                             theta1=ang, theta2=-ang)
flat = matplotlib.lines.Line2D([-flatX, flatX],
                               [flatY, flatY])
# Extra code that actually adds everything to the figure
fig.show()
So it looks like I'm using only:
matplotlib.pyplot.figure
matplotlib.patches.Rectangle
matplotlib.patches.Arc
matplotlib.lines.Line2D
However, those above are not individual modules in matplotlib (to my knowledge) - they are classes or functions inside their parent modules (patches, lines, pyplot), so I can't just `import matplotlib.patches.Arc` or anything.
So. What's my next step?
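One way to check the assumption about what an import pulls in is to inspect sys.modules after importing a single name. The sketch below uses the standard-library json package as a stand-in for matplotlib (an assumption, chosen so that it runs without matplotlib installed). The same mechanism should apply to something like `from matplotlib.patches import Arc`.

```python
from __future__ import print_function
import sys

# Ask for one function only; json stands in for matplotlib here.
from json import dumps

# Python still executed all of json/__init__.py, which in turn imports
# its own submodules, so they all end up loaded.
loaded = sorted(name for name in sys.modules if name.split('.')[0] == 'json')
print(loaded)
```

If this holds for matplotlib as well, importing a class by name is unlikely to reduce what gets loaded or bundled, because the whole parent module is executed either way.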
This might be a general question. I'm modifying Python code written by a former colleague. The main purpose of the code is to:
Read some file from local
Pop out a GUI to do some modification
Save the file to local
The GUI is written in Python with Tkinter, which I'm not very familiar with. Right now, I want to implement an auto-save function that runs alongside Tkinter's mainloop() and saves the modified files automatically every 5 minutes. I think I will need a second thread to do this, but I'm not sure how. Any ideas or examples would be much appreciated! Thanks.
Just like the comment says, use 'after' recursion.
import Tkinter
root = Tkinter.Tk()
def autosave():
    # do something you want
    root.after(60000 * 5, autosave) # time in milliseconds
autosave()
root.mainloop()
Threaded solution is possible too:
import threading
import time
import Tkinter
root = Tkinter.Tk()
def autosave():
    while True:
        # do something you want
        time.sleep(60 * 5)
saver = threading.Thread(target=autosave)
saver.start()
root.mainloop()
Before leaving, I use sys.exit() to kill all running threads and the GUI. I'm not sure whether that is the proper way to do it.
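As an alternative to relying on sys.exit(), here is a minimal sketch of a saver thread that can be asked to stop. The names `stop_event` and `save_count`, and the very short interval, are illustrative assumptions, and the Tkinter main loop is left out so the snippet runs on its own.

```python
import threading

stop_event = threading.Event()   # hypothetical flag used to request shutdown
save_count = [0]                 # stand-in for the real save work

def autosave(interval):
    # Event.wait() returns True as soon as stop_event is set, which ends the
    # loop; otherwise it times out after `interval` seconds and we save again.
    while not stop_event.wait(interval):
        save_count[0] += 1       # do the real save here

saver = threading.Thread(target=autosave, args=(0.01,))
saver.daemon = True              # do not keep the process alive on exit
saver.start()

# ... root.mainloop() would run here in the real program ...

stop_event.set()                 # ask the thread to finish
saver.join()                     # wait until it has actually stopped
```

In the real program the interval would be 60 * 5 seconds. The thread should only touch the data being saved and not the widgets, since calling Tkinter from a second thread is generally unsafe.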
Module OTS
from Tkinter import *
import openTableApiGet
#bunch of code blah blah blah
openTableApiGet.main() # call to the method in the OpenTableApiGet module
Module OpenTableApiGet
import OTS
class Parser:
    # Bunch of code in the class doing stuff
    pass
def main():
    # bunch of code
    # The main method the compiler says this module
    # doesn't have. Outside of the Parser class,
    # just hanging out on its own
    pass
Why is this happening? Is the circular import to blame? I'd rather keep it, but if I must change it I will. I need to write more to make Stack Overflow happy, so I hope you find tacos in your life soon!
Thanks everyone
You are correct, the circular import makes the code you've shown fail in some circumstances. Specifically, if outside code imports the OpenTableApiGet module first, the OTS module will fail when it tries to call OpenTableApiGet.main().
Here's why. When Python loads a module, it starts at the top and runs each statement sequentially. When it comes to an import statement, it may have to pause the execution of that module in order to load another module.
Here's an example:
A.py:
print 1
import B
print 3
B.py:
print 2
These two simple modules will print the numbers 1-3 in order when you import A.
An import statement doesn't always pause though. If the module to be imported is already loaded (or is in the process of being loaded, as in a circular import) Python will not load it again. It will just take a reference to the existing module object and put it into the importing namespace.
C.py:
print 1
import B
print 3
import B
print 4
Nothing will be printed when the second import statement is run, since the B module was already loaded (by the first import statement).
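The caching behaviour can be observed directly through sys.modules. This is a small sketch that writes a throwaway module to a temporary directory; the module name `demo_b` is hypothetical and exists only for the demonstration.

```python
from __future__ import print_function
import importlib
import os
import sys
import tempfile

# Create a throwaway module (hypothetical name demo_b) whose body prints.
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "demo_b.py"), "w") as f:
    f.write("print('loading demo_b')\n")
sys.path.insert(0, tmpdir)

first = importlib.import_module("demo_b")   # runs the module body
second = importlib.import_module("demo_b")  # served from sys.modules, body not rerun

print(first is second)                 # both names refer to one module object
print(sys.modules["demo_b"] is first)  # ...the one cached in sys.modules
```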
Here's a simple version of your modules that shows the issue with circular imports:
D.py:
print 1
import E
print 5 # this doesn't get a chance to run, nor the code below
x = 7
print 6
E.py:
print 2
import D
print 3
print D.x # this causes an exception, since D doesn't have an x attribute yet
print 4
If you import D, you'll get 1-3 printed and then an exception when the code in E tries to access a global variable in D that hasn't been initialized yet. Note that 5 does not get printed before the exception, as D's execution is paused waiting for E to finish loading.
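The failure can be reproduced in a self-contained way by generating two small modules on the fly. The names `demo_d` and `demo_e` are hypothetical stand-ins for D and E, with the print statements left out.

```python
from __future__ import print_function
import importlib
import os
import sys
import tempfile

# Hypothetical module names; the bodies mirror D.py and E.py.
sources = {
    "demo_d.py": "import demo_e\nx = 7\n",
    "demo_e.py": "import demo_d\nvalue = demo_d.x  # demo_d has not reached 'x = 7' yet\n",
}
tmpdir = tempfile.mkdtemp()
for filename, body in sources.items():
    with open(os.path.join(tmpdir, filename), "w") as f:
        f.write(body)
sys.path.insert(0, tmpdir)

try:
    importlib.import_module("demo_d")
    failed = False
except AttributeError as exc:
    failed = True
    print("circular import failed:", exc)
```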
There are a few ways to fix the code.
First a bad fix. There won't be an exception if you import E first, rather than D (though you'll get some of the numbers printed out of order). I don't recommend relying on this as a solution though, as if you change some imports around in later code it may break again and be very confusing!
Often the "best" approach is to reorganize the code to eliminate the circular dependencies between your modules. Either move some code from one module to the other or factor it out into a third module that both of them can import. This approach may be very vigorously advocated by programmers who learned programming with other languages where circular dependencies are always broken, but it's not nearly as big of an issue in Python.
Another option is to allow the circular import to stay, but simply to avoid doing too much at the top level of the module. Often you can put the troublesome code into a function (that's called by code outside the module) and it will work despite the circular imports. If you don't have any top-level code that tries to access the contents of the other module, circular imports are not a problem, since not much actually gets run until the imports are complete and all the modules have been fully loaded.
Here's an example of that:
F.py:
print 2
import D
print 3
def foo():
    print D.x # not at top level any more
print 4
main.py:
import D, F
F.foo()
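Here is a runnable sketch of the same idea, again using generated modules. The names `demo_g` and `demo_h` are hypothetical. The circular import stays, but the attribute lookup is deferred until the function is called.

```python
from __future__ import print_function
import importlib
import os
import sys
import tempfile

# Hypothetical module names; demo_h only touches demo_g.x inside a function.
sources = {
    "demo_g.py": "import demo_h\nx = 7\n",
    "demo_h.py": (
        "import demo_g\n"
        "def foo():\n"
        "    return demo_g.x  # looked up only when foo() is called\n"
    ),
}
tmpdir = tempfile.mkdtemp()
for filename, body in sources.items():
    with open(os.path.join(tmpdir, filename), "w") as f:
        f.write(body)
sys.path.insert(0, tmpdir)

demo_g = importlib.import_module("demo_g")  # completes despite the cycle
demo_h = importlib.import_module("demo_h")
result = demo_h.foo()                       # both modules are fully loaded by now
print(result)
```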
I would like to overwrite an existing plot I made in python with a new function call. I would like to produce a plot, look at it, then call the same function again with different arguments to produce another plot. I would like the second plot to replace the first plot. That is, I don't want two figure windows open; just the original window overwritten.
I have tried using interactive mode when plotting (ion()), placing plot() and show() calls in different places, and clearing figures. The problems I have are:
1. I can never overwrite an existing window; I always open more.
2. show() blocks the code from continuing, so I am unable to perform the second function call.
3. When I use interactive mode, the window appears for a second before going away.
What I'm trying to do seems simple enough, but I'm having great difficulty. Any help is appreciated.
Easiest solution
There are many ways to do this, the easiest of which is to reset the plot's Line2D using its set_ydata(...) method and pyplot.pause. There are versions of matplotlib (<0.9, I believe) which don't have pyplot.pause, so you may need to update yours. Here's a simple minimal working example:
import numpy as np
from matplotlib import pyplot as plt
ph, = plt.plot(np.random.rand(100))
def change_plot():
    ph.set_ydata(np.random.rand(100))
    plt.pause(1)
while True:
    change_plot()
Other approaches
Using pyplot.ion and pyplot.ioff, as detailed here. I tend to use these when I'm doing exploratory data analysis with a Python shell.
Using the matplotlib.animation package, as detailed in this very comprehensible example. This is a much more robust approach than the easy solution above, and permits all kinds of useful/fun options, such as outputting the animation to a video file, etc.
Instead of using the set_ydata method of the Lines object, you can always clear the axes (pyplot.cla()) and call the plotting command again. For example, if you are using pyplot.contour, the returned QuadContourSet has no set_zdata method, but this will work:
import numpy as np
from matplotlib import pyplot as plt
X,Y = np.meshgrid(np.arange(100),np.arange(100))
def change_plot():
    Z = np.random.random(X.shape)
    plt.cla()
    ph = plt.contour(X,Y,Z)
    plt.pause(1)
while True:
    change_plot()
Write your plotting function like
def my_plotter(ax, data, style):
    ax.cla()
    # ax.whatever for the rest of your plotting
    return artists_added
and then call it like
data = some_function()
arts = my_plotter(plt.gca(), data, ...)
or do
fig, ax = plt.subplots()
and then call your plotting function like
arts = my_plotter(ax, data, ...)
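A filled-in sketch of this pattern follows. The sample data, the style strings, and the choice of the non-interactive Agg backend are assumptions made so the snippet runs without a display; in an interactive session you would drop the `matplotlib.use` line.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs anywhere
import matplotlib.pyplot as plt

def my_plotter(ax, data, style):
    ax.cla()                              # wipe whatever the previous call drew
    artists_added = ax.plot(data, style)  # draw the new data into the same axes
    return artists_added

fig, ax = plt.subplots()
arts = my_plotter(ax, [1, 2, 3], "b-")    # first plot
arts = my_plotter(ax, [3, 1, 2], "r--")   # replaces it in the same axes

print(len(ax.lines))           # lines currently in the axes
print(len(plt.get_fignums()))  # open figures
```

Because the function always draws into the axes it is given, repeated calls reuse one figure rather than opening new windows. With an interactive backend, a redraw such as `fig.canvas.draw_idle()` or `plt.pause(...)` may be needed after each call for the window to refresh.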
I had almost the same issue and I solved it by assigning a name for each plot.
def acc(train_acc, test_acc, savename):
    # Selecting a named figure gives each plot its own figure. Without this
    # line, every call draws onto the same figure.
    plt.figure(savename)
    ep = np.arange(len(train_acc)) + 1
    plt.plot(ep, train_acc, color="blue", linewidth=1, linestyle="-", label="Train")
    plt.plot(ep, test_acc, color="red", linewidth=1, linestyle="-", label="Test")
    plt.title("NDCG")
    plt.xlabel("Iteration")
    plt.ylabel("NDCG#K")
    plt.legend(loc='lower right')
    plt.savefig(savename)
I have a small Windows module that relies on the ctypes core module. On the project's RTD site the page for the module comes up empty. Looking at the latest, almost successful, build log https://readthedocs.org/builds/apt/2900858/ there is a failure during the make html stage.
File "/var/build/user_builds/apt/checkouts/latest/knownpaths.py", line 5, in <module>
from ctypes import windll, wintypes
File "/usr/lib/python2.7/ctypes/wintypes.py", line 23, in <module>
class VARIANT_BOOL(_SimpleCData):
ValueError: _type_ 'v' not supported
Following the FAQ entry https://read-the-docs.readthedocs.org/en/latest/faq.html#i-get-import-errors-on-libraries-that-depend-on-c-modules I tried to fake-import ctypes using mock, but doing so causes the build to fail completely. From what I can tell, though I'm by no means an expert in this area, it's because mock itself is missing some math functions:
File "/var/build/user_builds/apt/checkouts/latest/knownpaths.py", line 13, in GUID
("Data4", wintypes.BYTE * 8)
TypeError: unsupported operand type(s) for *: 'Mock' and 'int'
Research on the error leads to only 3 search hits, the most relevant about Mock missing (at least) a true division operator: https://mail.python.org/pipermail/python-bugs-list/2014-March/235709.html
Am I following the right path? Can ctypes be used in a project on RTD and I just need to persevere, or do I need to give up and just use sphinx from my local machine?
Here is the current mock block from my conf.py:
import sys
try:
    # py3 import
    from unittest.mock import MagicMock
except ImportError:
    # py27 import
    from mock import Mock as MagicMock
class Mock(MagicMock):
    @classmethod
    def __getattr__(cls, name):
        return Mock()
MOCK_MODULES = ['ctypes']
sys.modules.update((mod_name, Mock()) for mod_name in MOCK_MODULES)
// this is a cross post from https://github.com/rtfd/readthedocs.org/issues/1342. Zero responses after a week so am looking farther afield. //
Initially I thought it was ctypes itself that needed to be mocked, but
it turns out I needed to work closer to home and mock the module which
calls ctypes, not ctypes itself.
- MOCK_MODULES = ['ctypes']
+ MOCK_MODULES = ['knownpaths']
Thank you to @Dunes, whose comment I thought was off-track and not going to help. However, it gave my mind and path of investigation just enough of a turn to land me in the right place after all. Not all teachings look like teaching when they first grace one's attention. ;-)
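A minimal sketch of why mocking the calling module helps: once an entry for `knownpaths` is in sys.modules, an import of that name returns the mock and the real file is never executed. The snippet assumes Python 3's unittest.mock, or the separate mock package on Python 2.7.

```python
import sys
try:
    from unittest.mock import MagicMock   # Python 3
except ImportError:
    from mock import MagicMock            # Python 2.7, needs the mock package

MOCK_MODULES = ['knownpaths']
sys.modules.update((mod_name, MagicMock()) for mod_name in MOCK_MODULES)

# The import system finds the entry in sys.modules first, so the real
# knownpaths.py (and its "from ctypes import windll") is never executed.
import knownpaths
is_mocked = isinstance(knownpaths, MagicMock)
```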
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import time
fig=plt.figure()
ax1=fig.add_subplot(1,1,1)
def animate():
    data = np.loadtxt("new.txt")
    ax1.plot(data[:,0], data[:,1])
    return
ani=animation.FuncAnimation(fig,animate,frames=1000)
plt.show()
The error popping up is
TypeError: animate() takes no arguments (1 given)
What should I do?
There seem to be two errors in your code:
The file you are trying to read really does not exist or is not readable by your script. (That is why you get the error message.)
As Hima correctly says, the FuncAnimation expects to receive a function (i.e. animate) whereas animate() is the return value of the function.
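To illustrate the callback convention without needing matplotlib, the sketch below uses a hypothetical helper, `call_each_frame`, as a stand-in for the animation loop. Like FuncAnimation, it receives the function object and calls it with the current frame value, so a callback that accepts no arguments fails with a TypeError.

```python
def call_each_frame(func, frames):
    # Hypothetical stand-in for the animation loop: it is handed the function
    # itself and calls it once per frame with the frame value.
    return [func(frame) for frame in range(frames)]

def animate_no_args():
    return "drawn"

def animate(i):
    return "drawn frame %d" % i

try:
    call_each_frame(animate_no_args, 3)
    error = None
except TypeError as exc:
    error = exc                         # the callback cannot accept the frame argument

results = call_each_frame(animate, 3)   # pass animate, not animate()
```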
Then there is one thing you should consider. If your data is in a simple file, you might try using the numpy.loadtxt to read that file. In that case your function animate would be something like:
import numpy as np
def animate(i):
    # FuncAnimation passes the current frame value as the first argument
    data = np.loadtxt("myfile.txt")
    ax1.plot(data[:,0], data[:,1])
However, even after that you will end up with an increasing number of plots, because every plot command creates a new line. Instead you might want to have a look at the set_xdata method. (As I guess this is a school assignment, I won't give the full solution.)