I am working on a script that plots data from Excel sheets using the openpyxl module. I am able to plot the data, but I could not find a way to set limits on an axis while plotting.
Here is my code:
ws2 = wb.create_sheet()
xvalues = Reference(ws2, (2, 1), (10, 1))
yvalues = Reference(ws2, (2,2), (10,2))
xseries = Series(xvalues, title="First series of values")
yseries = Series(yvalues, title="Second series of values",xvalues = xseries)
chart = ScatterChart()
chart.append(yseries)
ws2.add_chart(chart)
wb.save("C5122_534_09112014.xlsx")  # save the workbook; a worksheet has no save() method
Initially the chart module was set up to calculate axis maxima and minima for you. You can override this by setting auto_axis=False when you create the chart. You can then set the maximum and minimum for an axis:
chart = ScatterChart(auto_axis=False)
chart.x_axis.min = 5
chart.x_axis.max = 10
chart.x_axis.unit = 1
In 2.2 the default will no longer be to try to be so clever.
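For readers on newer openpyxl releases, where the chart module was rewritten, a rough equivalent might look like the sketch below. It assumes the newer openpyxl.chart API, in which (as far as I know) the limits live on the axis's scaling object and the tick spacing is majorUnit. The sample data, the anchor cell "D2" and the output file name are made up for illustration.

```python
from openpyxl import Workbook
from openpyxl.chart import Reference, ScatterChart, Series

wb = Workbook()
ws = wb.active

# Hypothetical sample data: x in column A, y = x**2 in column B, rows 1..10.
for i in range(1, 11):
    ws.append([i, i * i])

xvalues = Reference(ws, min_col=1, min_row=1, max_row=10)
yvalues = Reference(ws, min_col=2, min_row=1, max_row=10)
series = Series(yvalues, xvalues, title="Squares")

chart = ScatterChart()
chart.series.append(series)

# Axis limits and tick spacing (assumed newer-API names).
chart.x_axis.scaling.min = 5
chart.x_axis.scaling.max = 10
chart.x_axis.majorUnit = 1

ws.add_chart(chart, "D2")
wb.save("scatter_axis_limits.xlsx")  # hypothetical file name
```

The same attributes exist on chart.y_axis if the vertical range needs fixing as well.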
I would like to draw plots which preserve symbolic meaning for certain numeric values.
In an isympy shell I can write:
T = Symbol('T')
plot(exp(x/T).subs(T, 5))
Which gives the following plot
I don't care much about the numeric tick labels in the plot. What I am interested in is that where the x axis equals T = 5, the y axis equals e ≈ 2.718. In other words, I want to discard all tick labels on both axes and keep only one tick label on the x axis for T and one on the y axis for e.
Is something like this possible?
According to Sympy and plotting, you can customize a sympy plot via accessing ._backend.ax. In my current version I needed ._backend.ax[0].
Here is how your plot could be adapted:
from sympy import Symbol, plot, exp
from sympy.abc import x  # isympy predefines x; a plain script has to import it
t_val = 5
T = Symbol('T')
plot1 = plot(exp(x / T).subs(T, t_val))
fig = plot1._backend.fig
ax = plot1._backend.ax[0]
ax.set_xticks([t_val])
ax.set_xticklabels([str(T)])
e_val = float(exp(1))  # plain Python float, usable directly as a tick position
ax.set_yticks([e_val])
ax.set_yticklabels(["e"]) # or ax.set_yticklabels([f"{e_val:.3f}"]) ?
ax.plot([t_val, t_val, 0], [0, e_val, e_val], color='dodgerblue', ls='--')
fig.canvas.draw()
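If relying on the private _backend attribute feels fragile, the same single-tick idea can be sketched with matplotlib alone. This is not the sympy route from the answer above, just an assumed alternative that evaluates exp(x/T) numerically with numpy; the x range and the Agg backend are choices made for the example.

```python
import math

import matplotlib
matplotlib.use("Agg")  # headless backend for the example; drop this in a notebook
import matplotlib.pyplot as plt
import numpy as np

t_val = 5
xs = np.linspace(-10, 10, 200)  # assumed plotting range

fig, ax = plt.subplots()
ax.plot(xs, np.exp(xs / t_val))

# Keep exactly one tick per axis and give each a symbolic label.
ax.set_xticks([t_val])
ax.set_xticklabels(["T"])
ax.set_yticks([math.e])
ax.set_yticklabels(["e"])

# Dashed guide lines from the axes to the point (T, e).
ax.plot([t_val, t_val, xs[0]], [0, math.e, math.e], color="dodgerblue", ls="--")
fig.canvas.draw()
```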
After grouping etc. I get a Series like the one in the example below. I would like to show the average number for each bar. The code below shows only one entry (of course, as I have only one "legend"). Could anyone suggest a smart way of showing these numbers?
%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib
matplotlib.style.use('ggplot')
import pandas
# create Series
dict_ = {"Business" : 104.04,"Economy":67.04, "Markets":58.56, "Companies":38.48}
s = pandas.Series(data=dict_)
# plot it
ax = s.plot(kind='bar', color='#43C6DB', stacked=True, figsize=(20, 10), legend=False)
plt.tick_params(axis='both', which='major', labelsize=14)
plt.xticks(rotation=30) #rotate labels
# Shrink current axis by 20%
box = ax.get_position()
ax.set_position([box.x0, box.y0, box.width * 0.8, box.height])
#create new legend
legend = ['%s (%.1f a day)' %(i, row/7) for i, row in s.items()]  # iteritems() was removed in pandas 2.0
# Put the legend to the right of the current axis
L = ax.legend(legend, loc='center left', bbox_to_anchor=(1, 0.5), fontsize=18)
plt.show()
The legend has only a single entry: the handle of a blue bar. So even if you set the labels to a longer list, only the first element of that list is used as the label for the existing handle.
One idea is to duplicate the legend handle so that there are as many handles as labels:
legend = ['%s (%.1f a day)' %(i, row/7) for i, row in s.items()]  # iteritems() was removed in pandas 2.0
h,l = ax.get_legend_handles_labels()
L = ax.legend(handles = h*len(legend), labels=legend, loc='center left',
bbox_to_anchor=(1, 0.5), fontsize=18)
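A variation on the same idea, in case the existing handle is awkward to reuse: build one proxy handle per label with matplotlib.patches.Patch. This is only a sketch under the assumption that every bar shares one colour; it reuses the Series from the question, and the Agg backend is chosen just so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for the example; drop this in a notebook
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.patches import Patch

s = pd.Series({"Business": 104.04, "Economy": 67.04,
               "Markets": 58.56, "Companies": 38.48})

bar_color = "#43C6DB"
ax = s.plot(kind="bar", color=bar_color, legend=False)

# One label per bar: weekly total divided by 7 gives the daily average.
labels = ["%s (%.1f a day)" % (name, total / 7) for name, total in s.items()]

# One proxy patch per label, so the legend shows every entry.
handles = [Patch(color=bar_color) for _ in labels]
legend = ax.legend(handles=handles, labels=labels,
                   loc="center left", bbox_to_anchor=(1, 0.5))
```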
I am using plotly (so that I can get point information when I hover over a point) to visualise my clustered scatter plot. I am having trouble assigning different colours to the clusters I produced with KMeans. When plotting this with matplotlib.pyplot (as plt), I use the following code:
plt.scatter(result[:,0], result[:,1], c=cluster_labels)
cluster_labels being:
n_clusters = 3
km = KMeans(n_clusters).fit(result)
cluster_labels = km.labels_
This works totally fine, but I need the hover info.
This is where I am at so far with plotly:
trace = go.Scatter(
x = result[:,0],
y = result[:,1],
mode = 'markers',
text = index, # I want to see the index of each point
)
data = [trace]
# Plot and embed in ipython notebook!
py.iplot(data, filename='basic-scatter')
I appreciate the help!
Let's use the iris data set.
The labels from k-means are used as colors (marker=dict(color=kmeans.labels_)), just like in matplotlib:
from sklearn import datasets
from sklearn import cluster
import plotly
plotly.offline.init_notebook_mode()
iris = datasets.load_iris()
kmeans = cluster.KMeans(n_clusters=3,
random_state=42).fit(iris.data[:,0:2])
data = [plotly.graph_objs.Scatter(x=iris.data[:,0],
y=iris.data[:,1],
mode='markers',
marker=dict(color=kmeans.labels_)
)
]
plotly.offline.iplot(data)
Just to expand on Maximilian's method: if you're using sklearn version >= 0.17, you'll need to reshape your array, since passing 1d arrays is deprecated as of 0.17.
Here's an example with reshaping:
x = df[df.columns[1]]
x = x.values.reshape(-1,1)
y = df[df.columns[2]]
y = y.values.reshape(-1,1)
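# KMeans.fit ignores its second argument, so combine x and y into one (n, 2) array (assumes numpy is imported as np)
kmeans = cluster.KMeans(n_clusters = 3, random_state = 0).fit(np.hstack([x, y]))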
trace1 = go.Scatter(
x = df[df.columns[1]],
y = df[df.columns[2]],
mode = 'markers',
marker=dict(color=kmeans.labels_,
size = 7.5,
line = dict(width=2)
),
text = df.index,
name='Actual'
)
I'm trying to generate a heatmap in Python with a custom color for each cell based on its value.
data = [ [0,3,2,5],[2,3,3,0],...,[0,0,2,2]]
colors = {0:'red',2:'blue',3:'green',5:'purple'}
Could anyone help?
This is an MWE of it working:
from numpy import array
import matplotlib.pyplot as plt
from matplotlib import colors
data = array([[1,2,3],[2,3,5], [3,1,2]])
cols = {1:'red',2:'blue',3:'green',5:'purple'}
cvr = colors.ColorConverter()
tmp = sorted(cols.keys())
cols_rgb = [cvr.to_rgb(cols[k]) for k in tmp]
intervals = array(tmp + [tmp[-1]+1]) - 0.5
cmap, norm = colors.from_levels_and_colors(intervals,cols_rgb)
plt.pcolor(data,cmap = cmap, norm = norm)
Here's the result:
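The same value-to-colour mapping can also be sketched with ListedColormap and BoundaryNorm, using the dictionary from the question. This is an alternative under stated assumptions, not the method above: only the three rows actually listed in the question are used (the elided rows are left out), and color_for is a hypothetical helper added to make the mapping easy to check.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for the example; drop this in a notebook
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import BoundaryNorm, ListedColormap

data = np.array([[0, 3, 2, 5], [2, 3, 3, 0], [0, 0, 2, 2]])
colors = {0: "red", 2: "blue", 3: "green", 5: "purple"}

values = sorted(colors)
cmap = ListedColormap([colors[v] for v in values])

# One bin per value: edges sit half a unit below each value, plus a closing edge.
bounds = [v - 0.5 for v in values] + [values[-1] + 0.5]
norm = BoundaryNorm(bounds, cmap.N)


def color_for(value):
    """Hypothetical helper: RGBA colour that a cell with this value receives."""
    return cmap(norm(value))


fig, ax = plt.subplots()
mesh = ax.pcolormesh(data, cmap=cmap, norm=norm)
```

Values that are not keys of the dictionary (for example 1 or 4) fall into the bin of the nearest lower edge, so this only makes sense when the data contains just the listed values.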
I am able to generate a heatmap with quantity overlaid on the graphic as a visual of a pivot table. I would like to have a column next to the heatmap which shows the summation of rows and I would like to have a row under the heatmap that shows the summation of columns.
Is there a way to incorporate this into the heatmap figure? pv is my pivot table which I use to generate the heatmap figure. I would like to have a column on the right of the chart which has the summed values for each row. Likewise, I would like to have a row on the bottom of the chart which has the summed values for each column.
fig = plt.figure(figsize = (20,10))
mask = np.zeros_like(pv)
mask[np.tril_indices_from(mask)] = True
#with sns.axes_style("white"):
ax = sns.heatmap(pv, annot=True, cmap="YlGnBu",mask=mask, linecolor='b', cbar = False)
ax.xaxis.tick_top()
plt.xticks(rotation=90)
@Paul H's subplot suggestion worked for my purposes. The code below got me the figure shown. I am not sure this is the most resource-efficient method, but it got me what I needed.
fig = plt.figure(figsize=(20,15))
ax1 = plt.subplot2grid((20,20), (0,0), colspan=19, rowspan=19)
ax2 = plt.subplot2grid((20,20), (19,0), colspan=19, rowspan=1)
ax3 = plt.subplot2grid((20,20), (0,19), colspan=1, rowspan=19)
mask = np.zeros_like(pv)
mask[np.tril_indices_from(mask)] = True
sns.heatmap(pv, ax=ax1, annot=True, cmap="YlGnBu",mask=mask, linecolor='b', cbar = False)
ax1.xaxis.tick_top()
ax1.set_xticklabels(pv.columns,rotation=40)
sns.heatmap((pd.DataFrame(pv.sum(axis=0))).transpose(), ax=ax2, annot=True, cmap="YlGnBu", cbar=False, xticklabels=False, yticklabels=False)
sns.heatmap(pd.DataFrame(pv.sum(axis=1)), ax=ax3, annot=True, cmap="YlGnBu", cbar=False, xticklabels=False, yticklabels=False)
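A slightly more compact layout for the same idea might use plt.subplots with gridspec_kw instead of a 20x20 grid. This is a sketch only: pv here is a small made-up pivot table, the triangular mask is omitted for brevity, and the width and height ratios are guesses that may need tuning for real data.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for the example; drop this in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the pivot table.
pv = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]],
                  index=list("abc"), columns=list("xyz"))

row_sums = pv.sum(axis=1).to_frame("total")    # one column, shown on the right
col_sums = pv.sum(axis=0).to_frame("total").T  # one row, shown underneath

fig, axes = plt.subplots(
    2, 2, figsize=(8, 6),
    gridspec_kw={"width_ratios": [len(pv.columns), 1],
                 "height_ratios": [len(pv.index), 1]},
)
common = dict(annot=True, cmap="YlGnBu", cbar=False)
sns.heatmap(pv, ax=axes[0, 0], **common)
sns.heatmap(row_sums, ax=axes[0, 1], xticklabels=False, yticklabels=False, **common)
sns.heatmap(col_sums, ax=axes[1, 0], xticklabels=False, yticklabels=False, **common)
axes[1, 1].axis("off")  # empty corner cell
axes[0, 0].xaxis.tick_top()
```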