I have identified that the most time-consuming step in the execution of code is saving plots to memory. I'm using matplotlib for plotting and saving the plots. The issue is that I'm running several simulations and saving the plots resulting from these simulations; this effort is consuming an insane number of compute hours. I have verified that it is indeed the plotting that is doing the damage.
It seems that pyqtgraph renders and saves images comparatively faster than matplotlib. I want to know if something similar to the following lines of code could be implemented in pyqtgraph?
import matplotlib
matplotlib.use('Agg')
from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas
from matplotlib.figure import Figure
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
fig = plt.figure(figsize=(6, 2.5))
rows = 1
columns = 2
gs = gridspec.GridSpec(rows, columns, hspace=0.0,wspace=0.0)
aj=0
for specie in lines:
for transition in species[specie].items():
gss = gridspec.GridSpecFromSubplotSpec(2, 1, subplot_spec=gs[aj],hspace=0.0,height_ratios=[1, 3])
ax0 = fig.add_subplot(gss[0])
ax1 = fig.add_subplot(gss[1], sharex=ax0)
ax0.plot(fitregs[specie+transition[0]+'_Vel'],fitregs['N_residue'],color='black',linewidth=0.85)
ax1.plot(fitregs[specie+transition[0]+'_Vel'],fitregs['N_flux'],color='black',linewidth=0.85)
ax1.plot(fitregs[specie+transition[0]+'_Vel'],fitregs['Best_profile'],color='red',linewidth=0.85)
ax1.xaxis.set_minor_locator(AutoMinorLocator())
ax1.tick_params(which='both', width=1)
ax1.tick_params(which='major', length=5)
ax1.tick_params(which='minor', length=2.5)
ax1.text(0.70,0.70,r'$\chi^{2}_{\nu}$'+'= {}'.format(round(red_chisqr[aj],2)),transform=ax1.transAxes)
ax1.text(0.10,0.10,'{}'.format(specie+' '+transition[0]),transform=ax1.transAxes)
ak=ak+1
aj=aj+1
canvas = FigureCanvas(fig)
canvas.print_figure('fits.png')
Example output from the above code is
I´m trying to create some graphs from a dataframe I imported. The problem is I can create an only image with both graphs. I have this output:
And I´m looking for this Output:
Here is the code:
from pandas_datareader import data
import pandas as pd
import datetime
import matplotlib.pyplot as plt
df = pd.read_csv('csv.csv', index_col = 'Totales', parse_dates=True)
df.head()
df['Subastas'].plot()
plt.title('Subastadas')
plt.xlabel('Fechas')
plt.ylabel('Cant de Subastadas')
plt.subplot()
df['Impresiones_exchange'].plot()
plt.title('Impresiones_exchange')
plt.xlabel('Fechas')
plt.ylabel('Cant de Impresiones_exchange')
plt.subplot()
plt.show()
CSV data:
Totales,Subastas,Impresiones_exchange,Importe_a_pagar_a_medio,Fill_rate,ECPM_medio
Total_07/01/2017,1596260396,30453841,19742.04,3.024863813,0.733696498
Total_07/12/2017,1336604546,57558106,43474.29,9.368463445,0.656716233
Total_07/01/2018,1285872189,33518075,20614.4,4.872889166,0.678244085
Also, I would like to save the output in an xlsx file too!
Use plt.subplots() to define two separate Axes objects, then use the ax argument of df.plot() to associate a plot to an axis:
import pandas as pd
import matplotlib.pyplot as plt
f, (ax1, ax2) = plt.subplots(2,1,figsize=(5,10))
df['Impresiones_exchange'].plot(ax=ax2)
ax1.set_title('Impresiones_exchange')
ax1.set_xlabel('Fechas')
ax1.set_ylabel('Cant de Impresiones_exchange')
df['Subastas'].plot(ax=ax1)
ax2.set_title('Subastadas')
ax2.set_xlabel('Fechas')
ax2.set_ylabel('Cant de Subastadas')
I have data like this :
import pandas as pd
import matplotlib.pyplot as plt
index={'A','B','C','D','E'}
d={'typ':[1,2,2,2,1],'value':[10,25,15,17,13]}
df=pd.DataFrame(d,index=index)
I want to plot the dataframe in horizontal bars with different colors reffering to the column 'typ'
You can use the color parameter of matplotlib's barh function:
import pandas as pd
import matplotlib.pyplot as plt
index={'A','B','C','D','E'}
d={'typ':[1,2,2,2,1],'value':[10,25,15,17,13]}
df=pd.DataFrame(d,index=index)
# define the colors for each type
colors = {1:'blue', 2:'red'}
# plot the bars
plt.bar(range(len(df)), df['value'], align='center',
color=[colors[t] for t in df['typ']])
Helo everyone
I need some help. I wrote this scrip:
import matplotlib.pyplot as plt
import scipy
import pyfits
import numpy as np
import re
import os
import glob
import time
global numbers
numbers=re.compile(r'(\d+)')
def numericalSort(value):
parts = numbers.split(value)
parts[1::2] = map(int, parts[1::2])
return parts
image_list=sorted(glob.glob('*.fit'), key=numericalSort)
for i in range(len(image_list)):
hdulist=pyfits.open(image_list[i])
data=hdulist[0].data
dimension=hdulist[0].header['NAXIS1']
time=hdulist[0].header['TIME']
hours=float(time[:2])*3600
minutes=float(time[3:5])*60
sec=float(time[6:])
cas=hours+minutes+sec
y=[]
for n in range(0,dimension):
y.append(data.flat[n])
maxy= max(y)
print image_list[i],cas,maxy
plt.plot([cas],[maxy],'bo')
plt.ion()
plt.draw()
This scrip read fit data file. From each file find max value which is y value and from header TIME which is x value axis.
And now my problem...When I run this scrip I get graph but only with points. How I get graph with line (line point to point)?
Thank for answer and help
Your problem may well be here:
plt.plot([cas],[maxy],'bo')
at the point that this statement is encountered, cas is a single value and maxy is also a single value -- you have only one point to plot and therefore nothing to join. Next time round the loop you plot another single point, unconnected to the previous one, and so on.
I can't be sure, but perhaps you mean to do something like:
x = []
for i in range(len(image_list)):
hdulist=pyfits.open(image_list[i])
data=hdulist[0].data
dimension=hdulist[0].header['NAXIS1']
time=hdulist[0].header['TIME']
hours=float(time[:2])*3600
minutes=float(time[3:5])*60
sec=float(time[6:])
cas=hours+minutes+sec
x.append(cas)
y=[]
for n in range(0,dimension):
y.append(data.flat[n])
maxy= max(y)
print image_list[i],cas,maxy
plt.plot(x, y ,'bo-')
plt.ion()
plt.draw()
ie plot a single line once you've collected all the x and y values. The linestyle format, bo- which provides the connecting line.
OK here is solution
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import scipy
import pyfits
import numpy as np
import re
import os
import glob
import time
global numbers
numbers=re.compile(r'(\d+)')
def numericalSort(value):
parts = numbers.split(value)
parts[1::2] = map(int, parts[1::2])
return parts
fig=plt.figure()
ax1=fig.add_subplot(1,1,1)
def animate(i):
image_list=sorted(glob.glob('*.fit'), key=numericalSort)
cas,maxy=[],[]
files=open("data.dat","wr")
for n in range(len(image_list)):
hdulist=pyfits.open(image_list[n])
data=hdulist[0].data
maxy=data.max()
time=hdulist[0].header['TIME']
hours=int(float(time[:2])*3600)
minutes=int(float(time[3:5])*60)
sec=int(float(time[6:]))
cas=hours+minutes+sec
files.write("\n{},{}".format(cas,maxy))
files.close()
pool=open('data.dat','r')
data=pool.read()
dataA=data.split('\n')
xar=[]
yar=[]
pool.close()
for line in dataA:
if len(line)>1:
x,y=line.split(',')
xar.append(int(x))
yar.append(int(y))
print xar,yar
ax1.clear()
ax1.plot(xar,yar,'b-')
ax1.plot(xar,yar,'ro')
plt.title('Light curve')
plt.xlabel('TIME')
plt.ylabel('Max intensity')
plt.grid()
This script read some values from files and plot it.
I have a set of data that I load into python using a pandas dataframe. What I would like to do is create a loop that will print a plot for all the elements in their own frame, not all on one. My data is in an excel file structured in this fashion:
Index | DATE | AMB CO 1 | AMB CO 2 |...|AMB CO_n | TOTAL
1 | 1/1/12| 14 | 33 |...| 236 | 1600
. | ... | ... | ... |...| ... | ...
. | ... | ... | ... |...| ... | ...
. | ... | ... | ... |...| ... | ...
n
This is what I have for code so far:
import pandas as pd
import matplotlib.pyplot as plt
ambdf = pd.read_excel('Ambulance.xlsx',
sheetname='Sheet2', index_col=0, na_values=['NA'])
print type(ambdf)
print ambdf
print ambdf['EAS']
amb_plot = plt.plot(ambdf['EAS'], linewidth=2)
plt.title('EAS Ambulance Numbers')
plt.xlabel('Month')
plt.ylabel('Count of Deliveries')
print amb_plot
for i in ambdf:
print plt.plot(ambdf[i], linewidth = 2)
I am thinking of doing something like this:
for i in ambdf:
ambdf_plot = plt.plot(ambdf, linewidth = 2)
The above was not remotely what i wanted and it stems from my unfamiliarity with Pandas, MatplotLib etc, looking at some documentation though to me it looks like matplotlib is not even needed (question 2)
So A) How can I produce a plot of data for every column in my df
and B) do I need to use matplotlib or should I just use pandas to do it all?
Thank you,
Ok, so the easiest method to create several plots is this:
import matplotlib.pyplot as plt
x=[[1,2,3,4],[1,2,3,4],[1,2,3,4],[1,2,3,4]]
y=[[1,2,3,4],[1,2,3,4],[1,2,3,4],[1,2,3,4]]
for i in range(len(x)):
plt.figure()
plt.plot(x[i],y[i])
# Show/save figure as desired.
plt.show()
# Can show all four figures at once by calling plt.show() here, outside the loop.
#plt.show()
Note that you need to create a figure every time or pyplot will plot in the first one created.
If you want to create several data series all you need to do is:
import matplotlib.pyplot as plt
plt.figure()
x=[[1,2,3,4],[1,2,3,4],[1,2,3,4],[1,2,3,4]]
y=[[1,2,3,4],[2,3,4,5],[3,4,5,6],[7,8,9,10]]
plt.plot(x[0],y[0],'r',x[1],y[1],'g',x[2],y[2],'b',x[3],y[3],'k')
You could automate it by having a list of colours like ['r','g','b','k'] and then just calling both entries in this list and corresponding data to be plotted in a loop if you wanted to. If you just want to programmatically add data series to one plot something like this will do it (no new figure is created each time so everything is plotted in the same figure):
import matplotlib.pyplot as plt
x=[[1,2,3,4],[1,2,3,4],[1,2,3,4],[1,2,3,4]]
y=[[1,2,3,4],[2,3,4,5],[3,4,5,6],[7,8,9,10]]
colours=['r','g','b','k']
plt.figure() # In this example, all the plots will be in one figure.
for i in range(len(x)):
plt.plot(x[i],y[i],colours[i])
plt.show()
If anything matplotlib has a very good documentation page with plenty of examples.
17 Dec 2019: added plt.show() and plt.figure() calls to clarify this part of the story.
Use a dictionary!!
You can also use dictionaries that allows you to have more control over the plots:
import matplotlib.pyplot as plt
# plot 0 plot 1 plot 2 plot 3
x=[[1,2,3,4],[1,4,3,4],[1,2,3,4],[9,8,7,4]]
y=[[3,2,3,4],[3,6,3,4],[6,7,8,9],[3,2,2,4]]
plots = zip(x,y)
def loop_plot(plots):
figs={}
axs={}
for idx,plot in enumerate(plots):
figs[idx]=plt.figure()
axs[idx]=figs[idx].add_subplot(111)
axs[idx].plot(plot[0],plot[1])
return figs, axs
figs, axs = loop_plot(plots)
Now you can select the plot that you want to modify easily:
axs[0].set_title("Now I can control it!")
Of course, is up to you to decide what to do with the plots. You can either save them to disk figs[idx].savefig("plot_%s.png" %idx) or show them plt.show(). Use the argument block=False only if you want to pop up all the plots together (this could be quite messy if you have a lot of plots). You can do this inside the loop_plot function or in a separate loop using the dictionaries that the function provided.
Just to add returning figs and axs is not mandatory to execute plt.show().
Here are two examples of how to generate graphs in separate windows (frames), and, an example of how to generate graphs and save them into separate graphics files.
Okay, first the on-screen example. Notice that we use a separate instance of plt.figure(), for each graph, with plt.plot(). At the end, we have to call plt.show() to put it all on the screen.
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace( 0,10 )
for n in range(3):
y = np.sin( x+n )
plt.figure()
plt.plot( x, y )
plt.show()
Another way to do this, is to use plt.show(block=False) inside the loop:
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace( 0,10 )
for n in range(3):
y = np.sin( x+n )
plt.figure()
plt.plot( x, y )
plt.show( block=False )
Now, let's generate the graphs and instead, write them each to a file. Here we replace plt.show(), with plt.savefig( filename ). The difference from the previous example is that we don't have to account for ''blocking'' at each graph. Note also, that we number the file names. Here we use %03d so that we can conveniently have them in number order afterwards.
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace( 0,10 )
for n in range(3):
y = np.sin( x+n )
plt.figure()
plt.plot( x, y )
plt.savefig('myfilename%03d.png'%(n))
If your requirement is to plot against one column, then feel free to use this (First import data into a pandas DF) (plots a matrix of plots with 5 columns and as many rows required):
import math
i,j=0,0
PLOTS_PER_ROW = 5
fig, axs = plt.subplots(math.ceil(len(df.columns)/PLOTS_PER_ROW),PLOTS_PER_ROW, figsize=(20, 60))
for col in df.columns:
axs[i][j].scatter(df['target_col'], df[col], s=3)
axs[i][j].set_ylabel(col)
j+=1
if j%PLOTS_PER_ROW==0:
i+=1
j=0
plt.show()
A simple way of plotting on different frames would be like:
import matplotlib.pyplot as plt
for grp in list_groups:
plt.figure()
plt.plot(grp)
plt.show()
Then python will plot multiple frames for each iteration.
We can create a for loop and pass all the numeric columns into it.
The loop will plot the graphs one by one in separate pane as we are including
plt.figure() into it.
import pandas as pd
import seaborn as sns
import numpy as np
numeric_features=[x for x in data.columns if data[x].dtype!="object"]
#taking only the numeric columns from the dataframe.
for i in data[numeric_features].columns:
plt.figure(figsize=(12,5))
plt.title(i)
sns.boxplot(data=data[i])