Iterate Pandas Series to create a new chart legend - python-2.7

After grouping etc. I get a Series like in the example below. I would like to show the average numbers for each bar. The code below shows only one entry (of course, as I have only one "legend"). Could anyone suggest a smart way of showing these numbers?
%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib
matplotlib.style.use('ggplot')
import pandas
# create Series
dict_ = {"Business" : 104.04,"Economy":67.04, "Markets":58.56, "Companies":38.48}
s = pandas.Series(data=dict_)
# plot it
ax = s.plot(kind='bar', color='#43C6DB', stacked=True, figsize=(20, 10), legend=False)
plt.tick_params(axis='both', which='major', labelsize=14)
plt.xticks(rotation=30) #rotate labels
# Shrink current axis by 20%
box = ax.get_position()
ax.set_position([box.x0, box.y0, box.width * 0.8, box.height])
#create new legend
legend = ['%s (%.1f a day)' %(i, row/7) for i, row in s.iteritems()]
# Put the legend to the right of the current axis
L = ax.legend(legend, loc='center left', bbox_to_anchor=(1, 0.5), fontsize=18)
plt.show()

The legend has only a single entry: the handle of the one blue bar series. Therefore, even if you pass a longer list of labels, only the first element of that list is used as the label for the existing handle.
One option is to duplicate the legend handle so that the list of handles has the same length as the list of labels:
legend = ['%s (%.1f a day)' %(i, row/7) for i, row in s.iteritems()]
h,l = ax.get_legend_handles_labels()
L = ax.legend(handles = h*len(legend), labels=legend, loc='center left',
bbox_to_anchor=(1, 0.5), fontsize=18)
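A variation on the same idea, as a minimal sketch: instead of multiplying the existing handle, build one proxy handle per label with matplotlib.patches.Patch. The Agg backend and the use of Series.items() in place of iteritems() are choices made for this sketch, not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
from matplotlib.patches import Patch
import pandas as pd

s = pd.Series({"Business": 104.04, "Economy": 67.04,
               "Markets": 58.56, "Companies": 38.48})

fig, ax = plt.subplots()
s.plot(kind="bar", color="#43C6DB", ax=ax, legend=False)

# One label per bar; Series.items() yields (index, value) pairs.
labels = ["%s (%.1f a day)" % (name, total / 7) for name, total in s.items()]
# One proxy handle per label, all drawn in the bar colour.
handles = [Patch(color="#43C6DB") for _ in labels]

leg = ax.legend(handles=handles, labels=labels,
                loc="center left", bbox_to_anchor=(1, 0.5))
legend_texts = [t.get_text() for t in leg.get_texts()]
```

Because every label gets its own handle, the legend shows all four entries rather than only the first.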

Related

Plotting graph using pylab

I am trying to plot a graph. I have one list that contains action names (text) and another list that contains each action's frequency (int).
I want to plot a connected graph. This is the code I've written:
xTicks=np.array(action)
x=np.array(count)
y=np.array(freq)
pl.xticks(x,xTicks)
pl.xticks(rotation=90)
pl.plot(x,y)
pl.show()
The list xTicks holds the actions and the list y holds their frequencies.
With the above code, I am getting this graph:
Why am I getting extra spaces on the x axis? It should be symmetric. The lists have 130-135 entries, so how can I make the plot scrollable?
You need to set x to an evenly spaced list in order to get your x ticks to be evenly spaced. The following is an example with some made up data:
import matplotlib.pyplot as plt
import numpy as np
action = ["test1", "test2", "test3", "test4", "test5", "test6", "test7", "test8", "test9"]
freq = [5,3,7,4,8,3,5,1,12]
y=np.array(freq)
xTicks=np.array(action)
x = np.arange(0,len(action),1) # evenly spaced list with the same length as "freq"
plt.plot(x,y)
plt.xticks(x, xTicks, rotation=90)
plt.show()
This produces the following plot:
Update:
A simple example of a slider is shown below. You will have to make changes to this in order to get it exactly how you want but it will be a start:
from matplotlib.widgets import Slider
freq = [5,3,7,4,8,3,5,1,12,5,3,7,4,8,3,5,1,12,5,3,7,4,8,3,5,1,12,4,9,1]
y=np.array(freq)
x = np.arange(0,len(freq),1) # evenly spaced list with the same length as "freq"
fig, ax = plt.subplots()
plt.subplots_adjust(left=0.25, bottom=0.25)
l, = plt.plot(x, y, lw=2, color='red')
axfreq = plt.axes([0.25, 0.1, 0.65, 0.03], facecolor="lightblue")
sfreq = Slider(axfreq, 'Slider', 0.1, 10, valinit=3)
def update(val):
    l.set_xdata(val * x)
    fig.canvas.draw_idle()
sfreq.on_changed(update)
plt.show()
This produces the following graph which has a slider:
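The slider above rescales the x data. If the goal is to scroll through 130-135 points, one possible adaptation is to let the slider move a fixed-width window via ax.set_xlim. This is only a sketch: the window width of 20, the random data and the name scroll are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
from matplotlib.widgets import Slider
import numpy as np

freq = np.random.RandomState(0).randint(1, 13, size=130)  # made-up data
x = np.arange(len(freq))
window = 20  # number of points visible at once (arbitrary choice)

fig, ax = plt.subplots()
plt.subplots_adjust(bottom=0.25)
ax.plot(x, freq, lw=2, color="red")
ax.set_xlim(0, window)

slider_ax = fig.add_axes([0.15, 0.1, 0.7, 0.03])
scroll = Slider(slider_ax, "Start", 0, len(freq) - window, valinit=0)

def update(start):
    # Shift the visible window instead of changing the data.
    ax.set_xlim(start, start + window)
    fig.canvas.draw_idle()

scroll.on_changed(update)
scroll.set_val(50)  # simulate dragging the slider to position 50
lo, hi = ax.get_xlim()
```

In an interactive session, dragging the slider calls update in the same way as the set_val call does here.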

How to make tabular legend for geopandas plot

I am plotting a choropleth map using geopandas and I need to plot a customized tabular legend. This question's answer shows how to obtain a tabular legend for a contourf plot.
And I am using it in the code below:
import pandas as pd
import pysal as ps
import geopandas as gp
import numpy as np
import matplotlib.pyplot as plt
pth = 'outcom.shp'
tracts = gp.GeoDataFrame.from_file(pth)
ax = tracts.plot(column='Density', scheme='QUANTILES')
valeur = np.array([.1,.45,.7])
text=[["Faible","Ng<1,5" ],["Moyenne","1,5<Ng<2,5"],[u"Elevee", "Ng>2,5"]]
colLabels = ["Exposition", u"Densite"]
tab = ax.table(cellText=text, colLabels=colLabels, colWidths = [0.2,0.2], loc='lower right', cellColours=plt.cm.hot_r(np.c_[valeur,valeur]))
plt.show()
And here's the result I get:
As you can see, there is no link between the colors of the classes in the map and those in the table. I need the exact colors shown in the table to appear in the map. The 'Ng' values shown in the legend should be extracted from the 'Density' column that I am plotting.
However, since I do not have a contour plot to extract the colormap from, I'm lost on how to link the tabular legend and the map's colors.
Note: This answer is outdated. Modern geopandas supports a normal legend via the legend=True argument. I am keeping it here for reference, or in case someone wants a truly tabular legend.
The geopandas plot does not support adding a legend. It also does not provide access to its plotting object and only returns an axes with the shapes as polygons. (It does not even provide a PolyCollection to work with). It is therefore a lot of tedious work to create a normal legend for such a plot.
Fortunately, some of this work has already been done in the example notebook Choropleth classification with PySAL and GeoPandas - With legend
So we need to take this code and implement the custom tabular legend which comes from this answer.
Here is the complete code:
def __pysal_choro(values, scheme, k=5):
    """ Wrapper for choropleth schemes from PySAL for use with plot_dataframe
    Parameters
    ----------
    values
        Series to be plotted
    scheme
        pysal.esda.mapclassify classification scheme ['Equal_interval'|'Quantiles'|'Fisher_Jenks']
    k
        number of classes (2 <= k <=9)
    Returns
    -------
    binning
        PySAL classifier; binning.yb holds the class identifiers and binning.bins the upper bin edges
    """
    try:
        from pysal.esda.mapclassify import Quantiles, Equal_Interval, Fisher_Jenks
        schemes = {}
        schemes['equal_interval'] = Equal_Interval
        schemes['quantiles'] = Quantiles
        schemes['fisher_jenks'] = Fisher_Jenks
        s0 = scheme
        scheme = scheme.lower()
        if scheme not in schemes:
            scheme = 'quantiles'
            print('Unrecognized scheme: ', s0)
            print('Using Quantiles instead')
        if k < 2 or k > 9:
            print('Invalid k: ', k)
            print('2<=k<=9, setting k=5 (default)')
            k = 5
        binning = schemes[scheme](values, k)
        values = binning.yb
    except ImportError:
        print('PySAL not installed, setting map to default')
        raise  # binning is undefined without PySAL, so re-raise instead of returning it
    return binning
def plot_polygon(ax, poly, facecolor='red', edgecolor='black', alpha=0.5, linewidth=1):
    """ Plot a single Polygon geometry """
    from descartes.patch import PolygonPatch
    a = np.asarray(poly.exterior)
    # without Descartes, we could make a Patch of exterior
    ax.add_patch(PolygonPatch(poly, facecolor=facecolor, alpha=alpha))
    ax.plot(a[:, 0], a[:, 1], color=edgecolor, linewidth=linewidth)
    for p in poly.interiors:
        x, y = zip(*p.coords)
        ax.plot(x, y, color=edgecolor, linewidth=linewidth)
def plot_multipolygon(ax, geom, facecolor='red', edgecolor='black', alpha=0.5, linewidth=1):
    """ Can safely call with either Polygon or Multipolygon geometry
    """
    if geom.type == 'Polygon':
        plot_polygon(ax, geom, facecolor=facecolor, edgecolor=edgecolor, alpha=alpha, linewidth=linewidth)
    elif geom.type == 'MultiPolygon':
        for poly in geom.geoms:
            plot_polygon(ax, poly, facecolor=facecolor, edgecolor=edgecolor, alpha=alpha, linewidth=linewidth)
import numpy as np
from geopandas.plotting import (plot_linestring, plot_point, norm_cmap)
def plot_dataframe(s, column=None, colormap=None, alpha=0.5,
                   categorical=False, legend=False, axes=None, scheme=None,
                   k=5, linewidth=1):
    """ Plot a GeoDataFrame
    Generate a plot of a GeoDataFrame with matplotlib. If a
    column is specified, the plot coloring will be based on values
    in that column. Otherwise, a categorical plot of the
    geometries in the `geometry` column will be generated.
    Parameters
    ----------
    GeoDataFrame
        The GeoDataFrame to be plotted. Currently Polygon,
        MultiPolygon, LineString, MultiLineString and Point
        geometries can be plotted.
    column : str (default None)
        The name of the column to be plotted.
    categorical : bool (default False)
        If False, colormap will reflect numerical values of the
        column being plotted. For non-numerical columns (or if
        column=None), this will be set to True.
    colormap : str (default 'Set1')
        The name of a colormap recognized by matplotlib.
    alpha : float (default 0.5)
        Alpha value for polygon fill regions. Has no effect for
        lines or points.
    legend : bool (default False)
        Plot a legend (Experimental; currently for categorical
        plots only)
    axes : matplotlib.pyplot.Artist (default None)
        axes on which to draw the plot
    scheme : pysal.esda.mapclassify.Map_Classifier
        Choropleth classification schemes
    k : int (default 5)
        Number of classes (ignored if scheme is None)
    Returns
    -------
    matplotlib axes instance
    """
    import matplotlib.pyplot as plt
    from matplotlib.lines import Line2D
    from matplotlib.colors import Normalize
    from matplotlib import cm
    if column is None:
        raise NotImplementedError
        #return plot_series(s.geometry, colormap=colormap, alpha=alpha, axes=axes)
    else:
        if s[column].dtype is np.dtype('O'):
            categorical = True
        if categorical:
            if colormap is None:
                colormap = 'Set1'
            categories = list(set(s[column].values))
            categories.sort()
            valuemap = dict([(j, v) for (v, j) in enumerate(categories)])
            values = [valuemap[j] for j in s[column]]
        else:
            values = s[column]
        if scheme is not None:
            binning = __pysal_choro(values, scheme, k=k)
            values = binning.yb
            # set categorical to True for creating the legend
            categorical = True
            binedges = [binning.yb.min()] + binning.bins.tolist()
            categories = ['{0:.2f} - {1:.2f}'.format(binedges[i], binedges[i+1]) for i in range(len(binedges)-1)]
        cmap = norm_cmap(values, colormap, Normalize, cm)
        if axes == None:
            fig = plt.gcf()
            fig.add_subplot(111, aspect='equal')
            ax = plt.gca()
        else:
            ax = axes
        for geom, value in zip(s.geometry, values):
            if geom.type == 'Polygon' or geom.type == 'MultiPolygon':
                plot_multipolygon(ax, geom, facecolor=cmap.to_rgba(value), alpha=alpha, linewidth=linewidth)
            elif geom.type == 'LineString' or geom.type == 'MultiLineString':
                raise NotImplementedError
                #plot_multilinestring(ax, geom, color=cmap.to_rgba(value))
            # TODO: color point geometries
            elif geom.type == 'Point':
                raise NotImplementedError
                #plot_point(ax, geom, color=cmap.to_rgba(value))
        if legend:
            if categorical:
                rowtitle = ["Moyenne"] * len(categories)
                rowtitle[0] = "Faible"; rowtitle[-1] = u"Elevée"
                # list(...) so that the table also gets real lists under Python 3
                text = list(zip(rowtitle, categories))
                colors = []
                for i in range(len(categories)):
                    color = list(cmap.to_rgba(i))
                    color[3] = alpha
                    colors.append(color)
                colLabels = ["Exposition", u"Densité"]
                tab = plt.table(cellText=text, colLabels=colLabels,
                                colWidths=[0.2, 0.2], loc='upper left',
                                cellColours=list(zip(colors, colors)))
            else:
                # TODO: show a colorbar
                raise NotImplementedError
    plt.draw()
    return ax
if __name__ == "__main__":
    import pysal as ps
    import geopandas as gp
    import matplotlib.pyplot as plt
    pth = ps.examples.get_path("columbus.shp")
    tracts = gp.GeoDataFrame.from_file(pth)
    ax = plot_dataframe(tracts, column='CRIME', scheme='QUANTILES', k=5, colormap='OrRd', legend=True)
    plt.show()
resulting in the following image:
Your problem is the cmap. Pass the same colormap to the map:
ax = tracts.plot(......scheme='QUANTILES',cmap='jet')
and to the table:
tab = ...... cellColours=plt.cm.jet(np.c_[valeur,valeur]))
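To illustrate why sharing the colormap (and its normalisation) keeps the two in sync, here is a small sketch that does not need geopandas. Coloured bars stand in for the map polygons, and the names mapper, class_ids and cell_colours are made up for this example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.cm import ScalarMappable
from matplotlib.colors import Normalize

k = 3                     # number of classes (assumed)
class_ids = np.arange(k)  # stand-in for the class id of each polygon
mapper = ScalarMappable(norm=Normalize(vmin=0, vmax=k - 1),
                        cmap=plt.get_cmap("jet"))
colours = mapper.to_rgba(class_ids)  # one RGBA row per class

fig, ax = plt.subplots()
# Stand-in for the map: one bar per class, coloured through the mapper.
bars = ax.bar(class_ids, [1] * k, color=colours)

rows = [["Faible", "Ng<1,5"], ["Moyenne", "1,5<Ng<2,5"], ["Elevee", "Ng>2,5"]]
# The table cells reuse the same RGBA values, so the colours cannot drift apart.
cell_colours = [[tuple(c), tuple(c)] for c in colours]
tab = ax.table(cellText=rows, colLabels=["Exposition", "Densite"],
               colWidths=[0.2, 0.2], loc="lower right",
               cellColours=cell_colours)
first_cell = tab.get_celld()[(1, 0)]  # row 0 is the header row
```

The key point is that both the shapes and the table cells take their colours from one mapper object rather than from two independently chosen colormaps.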

How can I have a bar next to python seaborn heatmap which shows the summation of row values?

I am able to generate a heatmap with quantity overlaid on the graphic as a visual of a pivot table. I would like to have a column next to the heatmap which shows the summation of rows and I would like to have a row under the heatmap that shows the summation of columns.
Is there a way to incorporate this into the heatmap figure? pv is my pivot table which I use to generate the heatmap figure. I would like to have a column on the right of the chart which has the summed values for each row. Likewise, I would like to have a row on the bottom of the chart which has the summed values for each column.
fig = plt.figure(figsize = (20,10))
mask = np.zeros_like(pv)
mask[np.tril_indices_from(mask)] = True
#with sns.axes_style("white"):
ax = sns.heatmap(pv, annot=True, cmap="YlGnBu",mask=mask, linecolor='b', cbar = False)
ax.xaxis.tick_top()
plt.xticks(rotation=90)
Paul H's subplot suggestion worked for my purposes. The code below produced the figure shown. I am not sure whether this is the most resource-efficient method, but it got me what I needed.
fig = plt.figure(figsize=(20,15))
ax1 = plt.subplot2grid((20,20), (0,0), colspan=19, rowspan=19)
ax2 = plt.subplot2grid((20,20), (19,0), colspan=19, rowspan=1)
ax3 = plt.subplot2grid((20,20), (0,19), colspan=1, rowspan=19)
mask = np.zeros_like(pv)
mask[np.tril_indices_from(mask)] = True
sns.heatmap(pv, ax=ax1, annot=True, cmap="YlGnBu",mask=mask, linecolor='b', cbar = False)
ax1.xaxis.tick_top()
ax1.set_xticklabels(pv.columns,rotation=40)
sns.heatmap((pd.DataFrame(pv.sum(axis=0))).transpose(), ax=ax2, annot=True, cmap="YlGnBu", cbar=False, xticklabels=False, yticklabels=False)
sns.heatmap(pd.DataFrame(pv.sum(axis=1)), ax=ax3, annot=True, cmap="YlGnBu", cbar=False, xticklabels=False, yticklabels=False)
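The two extra heatmaps work because each total is reshaped into a 2-D frame that lines up with the main grid. A small sketch with a made-up pivot table shows the shapes involved; to_frame() is used here as an equivalent of wrapping the sums in pd.DataFrame.

```python
import pandas as pd

# Made-up stand-in for the pivot table pv.
pv = pd.DataFrame([[1, 2, 3], [4, 5, 6]],
                  index=["a", "b"], columns=["x", "y", "z"])

# Column totals as a single row: goes under the main heatmap (ax2).
col_totals = pv.sum(axis=0).to_frame().T
# Row totals as a single column: goes to the right of the main heatmap (ax3).
row_totals = pv.sum(axis=1).to_frame()
```

col_totals has one row and as many columns as pv, while row_totals has one column and as many rows as pv, which is why they fit the bottom and right strips of the grid.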

Multi-Axis Graph with Line on top. Matplotlib

I'm attempting to make use of twinx() to create a bar/line combo graph with the line visible on top of the bar. Currently this is how it appears:
I also need the line chart to be plotted on the left vertical axis (ax) and the bar on the right (ax2) as it currently is. If I plot the line on the second axis it does appear on top, but obviously it appears on the wrong axis (right)
Here's my code:
self.ax2=ax.twinx()
df[['Opportunities']].plot(kind='bar', stacked=False, title=get_title, color='grey', ax=self.ax2, grid=False)
ax.plot(ax.get_xticks(),df[['Percentage']].values, linestyle='-', marker='o', color='k', linewidth=1.0)
lines, labels = ax.get_legend_handles_labels()
lines2, labels2 = self.ax2.get_legend_handles_labels()
ax.legend(lines + lines2, labels + labels2, loc='lower right')
Also having trouble with the labels, but one thing at a time.
It appears that, by default, the artists on ax are drawn first, and then the artists on the twin axes ax2 are drawn on top. Since in your code the line plot was drawn on ax and the bar plot on ax2, the bar plot sits on top of (and obscures) the line.
(I thought I could change this by specifying zorder, but that attempt did not work...)
So one way to solve the problem is to use ax to draw the bar plot and ax2 to draw the line. That will place the line on top of the bars. It will also, by default, place the ytick labels for ax (the bar plot) on the left, and the ytick labels for ax2 (the line) on the right. However, you can use
ax.yaxis.set_ticks_position("right")
ax2.yaxis.set_ticks_position("left")
to swap the location of the left and right ytick labels.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd
np.random.seed(2015)
N = 16
df = pd.DataFrame({'Opportunities': np.random.randint(0, 30, size=N),
'Percentage': np.random.randint(0, 100, size=N)},
index=pd.date_range('2015-3-15', periods=N, freq='B').date)
fig, ax = plt.subplots()
df[['Opportunities']].plot(kind='bar', stacked=False, title='get_title',
color='grey', ax=ax, grid=False)
ax2 = ax.twinx()
ax2.plot(ax.get_xticks(), df[['Percentage']].values, linestyle='-', marker='o',
color='k', linewidth=1.0, label='percentage')
lines, labels = ax.get_legend_handles_labels()
lines2, labels2 = ax2.get_legend_handles_labels()
ax.legend(lines + lines2, labels + labels2, loc='best')
ax.yaxis.set_ticks_position("right")
ax2.yaxis.set_ticks_position("left")
fig.autofmt_xdate()
plt.show()
yields
Alternatively, the zorder of the axes can be set so as to draw ax above ax2. Paul Ivanov shows how:
ax.set_zorder(ax2.get_zorder()+1) # put ax in front of ax2
ax.patch.set_visible(False) # hide the 'canvas'
ax2.patch.set_visible(True) # show the 'canvas'
Thus,
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd
np.random.seed(2015)
N = 16
df = pd.DataFrame({'Opportunities': np.random.randint(0, 30, size=N),
'Percentage': np.random.randint(0, 100, size=N)},
index=pd.date_range('2015-3-15', periods=N, freq='B').date)
fig, ax = plt.subplots()
ax2 = ax.twinx()
df[['Opportunities']].plot(kind='bar', stacked=False, title='get_title',
color='grey', ax=ax2, grid=False)
ax.plot(ax.get_xticks(), df[['Percentage']].values, linestyle='-', marker='o',
color='k', linewidth=1.0, label='percentage')
lines, labels = ax.get_legend_handles_labels()
lines2, labels2 = ax2.get_legend_handles_labels()
ax.legend(lines + lines2, labels + labels2, loc='best')
ax.set_zorder(ax2.get_zorder()+1) # put ax in front of ax2
ax.patch.set_visible(False) # hide the 'canvas'
ax2.patch.set_visible(True) # show the 'canvas'
fig.autofmt_xdate()
plt.show()
yields the same result without having to swap the roles played by ax and ax2.

How to set limits on x and y axis Openpyxl charts

I am working on a script to plot data in Excel sheets using the openpyxl module. I am able to plot the data, but I could not find a way to set limits on the axes while plotting.
Here is my code:
ws2 = wb.create_sheet()
xvalues = Reference(ws2, (2, 1), (10, 1))
yvalues = Reference(ws2, (2,2), (10,2))
xseries = Series(xvalues, title="First series of values")
yseries = Series(yvalues, title="Second series of values",xvalues = xseries)
chart = ScatterChart()
chart.append(yseries)
ws2.add_chart(chart)
wb.save("C5122_534_09112014.xlsx")
Initially, the chart module was set up to calculate axis maxima and minima for you. You can override this by setting auto_axis=False when you create the chart. You can then set the maximum and minimum for an axis:
chart = ScatterChart(auto_axis=False)
chart.x_axis.min = 5
chart.x_axis.max = 10
chart.x_axis.unit = 1
In 2.2 the default will no longer be to try to work out the axis limits for you.
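For readers on a later openpyxl release, where (as far as I know, from 2.3 onwards) the chart module was rewritten, the limits live on the axis's scaling object rather than on the axis itself. The following is a sketch under that assumption, with made-up data; the commented-out filename is hypothetical.

```python
from openpyxl import Workbook
from openpyxl.chart import Reference, ScatterChart, Series

wb = Workbook()
ws = wb.active
ws.append(["x", "y"])           # header row
for i in range(1, 10):          # made-up data in rows 2..10
    ws.append([i, i * i])

xvalues = Reference(ws, min_col=1, min_row=2, max_row=10)
yvalues = Reference(ws, min_col=2, min_row=2, max_row=10)
series = Series(yvalues, xvalues, title="Second series of values")

chart = ScatterChart()
chart.series.append(series)
# Axis limits are set on the scaling object of each axis.
chart.x_axis.scaling.min = 5
chart.x_axis.scaling.max = 10
chart.y_axis.scaling.min = 0
chart.y_axis.scaling.max = 100
ws.add_chart(chart, "D2")
# wb.save("chart_limits.xlsx")  # hypothetical filename; uncomment to write the file
```

Note that saving is done on the workbook, not on the worksheet.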