How to set offset with matplotlib - python-2.7

I'm trying to remove the offset that matplotlib automatically puts on my graphs. For example, with the following code:
x=np.array([1., 2., 3.])
y=2.*x*1.e7
MyFig = plt.figure()
MyAx = MyFig.add_subplot(111)
MyAx.plot(x,y)
I obtain the following result (sorry, I cannot post an image): the y-axis has the ticks 2, 2.5, 3, ..., 6, with a single "x10^7" at the top of the y-axis.
I would like to remove the "x10^7" from the top of the axis and make it appear with each tick (2x10^7, 2.5x10^7, etc.). If I understood other topics correctly, I have to play with the useOffset variable. So I tried the following:
MyFormatter = MyAx.axes.yaxis.get_major_formatter()
MyFormatter.useOffset(False)
MyAx.axes.yaxis.set_major_formatter(MyFormatter)
without any success (the result is unchanged).
Am I doing something wrong? How can I change this behaviour? Or do I have to set the ticks manually?
Thanks in advance for any help!

You can use the FuncFormatter from the ticker module to format the ticklabels as you please:
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import FuncFormatter
x=np.array([1., 2., 3.])
y=2.*x*1.e7
MyFig = plt.figure()
MyAx = MyFig.add_subplot(111)
def sci_notation(x, pos):
    return "${:.1f} \\times 10^{{7}}$".format(x / 1.e7)
MyFormatter = FuncFormatter(sci_notation)
MyAx.axes.yaxis.set_major_formatter(MyFormatter)
MyAx.plot(x,y)
plt.show()
On a side note: the "x10^7" value that appears at the top of your axis is not an offset, but a factor used in scientific notation. This behavior can be disabled by calling MyFormatter.set_scientific(False); numbers will then be displayed as plain decimals.
An offset is a value you add to (or subtract from) the tick values, whereas a factor is something you multiply them by, i.e. a scale.
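To see the difference, here is a minimal sketch (my own illustration, not from the question): values that are large but very close together trigger an additive offset rather than a scale factor:
import numpy as np
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot(np.array([1.0000001e7, 1.0000002e7, 1.0000003e7]))
# The tick labels are now small numbers with an offset text
# (something like "+1e7") at the top of the axis: a value that is
# added to each tick, not multiplied with it.
plt.show()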
For reference, the line
MyFormatter.useOffset(False)
should be
MyFormatter.set_useOffset(False)
as the former is a bool attribute (it can only hold the values True or False), which means it cannot be called as a method. The latter is the method used to enable/disable the offset.
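Putting the two remarks together, a minimal sketch (reusing MyAx from the question) that disables both the offset and the scientific-notation factor:
from matplotlib.ticker import ScalarFormatter

fmt = ScalarFormatter()
fmt.set_useOffset(False)   # disable the additive offset
fmt.set_scientific(False)  # disable the multiplicative "x10^7" factor
MyAx.yaxis.set_major_formatter(fmt)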

Related

Integrating an array in scipy with bounds

I am trying to integrate over an array of data, but with bounds. Therefore I planned to use simps (scipy.integrate.simps). Because simps itself does not support bounds, I decided to feed it only the selection of my data I want to integrate over. Yet this leads to strange results which are twice as big as the expected outcome.
What am I doing wrong, or what am I missing or misunderstanding?
# -*- coding: utf-8 -*-
from scipy import integrate
from scipy import interpolate
import numpy as np
import matplotlib.pyplot as plt
# my data
x = np.linspace(-10, 10, 30)
y = x**2
# but I only want to integrate from 3 to 5
f = interpolate.interp1d(x, y)
x_selection = np.linspace(3, 5, 10)
y_selection = f(x_selection)
# quad returns the expected result
print 'quad', integrate.quad(f, 3, 5), '<- the expected value (including error estimation)'
# but simps returns an unexpected result when using the selected data
print 'simps', integrate.simps(x_selection, y_selection), '<- twice as big'
print 'trapz', integrate.trapz(x_selection, y_selection), '<- also twice as big'
plt.plot(x, y, marker='.')
plt.fill_between(x, y, 0, alpha=0.5)
plt.plot(x_selection, y_selection, marker='.')
plt.fill_between(x_selection, y_selection, 0, alpha=0.5)
plt.show()
Windows7, python2.7, scipy1.0.0
The arguments for simps() and trapz() are in the wrong order.
You have flipped the calling arguments: simps and trapz expect the y dimension first and the x dimension second, as per the docs. Once you have corrected this, you should obtain similar results (see the sketch below). Note that your example function admits a trivial analytic antiderivative, which would be much cheaper to evaluate.
– N. Wouda
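For reference, a sketch of the corrected calls (reusing the variables from the question; the analytic value serves as a check, since the integral of x^2 from 3 to 5 is (5^3 - 3^3)/3 = 98/3 ≈ 32.67):
# y comes first, x second
print 'simps', integrate.simps(y_selection, x_selection)  # ~32.67
print 'trapz', integrate.trapz(y_selection, x_selection)  # ~32.67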

How can I remove the negative sign from y tick labels in matplotlib.pyplot figure?

I am using the matplotlib.pyplot module to generate thousands of figures that all deal with a value called "Total Vertical Depth (TVD)". The data these values come from are all negative numbers, but the industry standard is to display them as positive (i.e. distance from zero / absolute value). My y-axis displays the numbers and of course uses the actual (negative) values to label the axis ticks. I do not want to change the values, but am wondering how to access the text elements and just remove the negative symbols from each value (shown in red circles on the image).
Several iterations of code after diving into the matplotlib documentation have gotten me to the following, but I am still getting an error.
locs, labels = plt.yticks()
newLabels = []
for lbl in labels:
    newLabels.append((lbl[0], lbl[1], str(float(str(lbl[2])) * -1)))
plt.yticks(locs, newLabels)
It appears that some of the strings in the "labels" list are empty and therefore the cast isn't working correctly, but I don't understand why it has any empty values if the yticks() method retrieves the current tick configuration.
@SiHA points out that if we change the data, the order of the labels on the y-axis will be reversed. (The empty strings you saw appear because tick label text is typically only filled in once the figure has been drawn.) So we can use a ticker formatter to change just the labels, without changing the data, as shown in the example below:
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import numpy as np
@ticker.FuncFormatter
def major_formatter(x, pos):
    label = str(-x) if x < 0 else str(x)
    return label
y = np.linspace(-3000,-1000,2001)
fig, ax = plt.subplots()
ax.plot(y)
ax.yaxis.set_major_formatter(major_formatter)
plt.show()
This gives me the following plot; notice the order of the y-axis labels.
Edit:
Based on Amit's great answer, here's the solution if you want to edit the data instead of the tick formatter:
import matplotlib.pyplot as plt
import numpy as np
y = np.linspace(-3000,-1000,2001)
fig, ax = plt.subplots()
ax.plot(-y) # invert y-values of the data
ax.invert_yaxis() # invert the axis so that larger values are displayed at the bottom
plt.show()

Show all colors on colorbar with scatter plot

In the following I use scatter and my own ListedColormap to plot some coloured data points, together with the corresponding colorbar.
import matplotlib.pyplot as plt
from matplotlib.colors import LinearSegmentedColormap, ListedColormap, BoundaryNorm
from numpy import arange
fig, ax = plt.subplots()
my_cm = ListedColormap(['#a71b1b','#94258f','#ea99e6','#ec9510','#ece43b','#a3f8ff','#2586df','#035e0d'])
bounds=range(8)
norm = BoundaryNorm(bounds, my_cm.N)
data = [1,2,1,3,0,5,3,4]
ret = ax.scatter(range(my_cm.N), [1]*my_cm.N, c=data, edgecolors='face', cmap=my_cm, s=50)
cbar = fig.colorbar(ret, ax=ax, boundaries=arange(-0.5,8,1), ticks=bounds, norm=norm)
cbar.ax.tick_params(axis='both', which='both',length=0)
If my data does not cover each value of the boundary interval, the colorbar does not show all colours (as in the added figure). If data were set to range(8), I would get a dot of each colour and the colorbar would also show all colours.
How can I force the colorbar to show all defined colours even if data does not contain all boundary values?
You need to manually set vmin and vmax in your call to ax.scatter:
ret = ax.scatter(range(my_cm.N), [1]*my_cm.N, c=data, edgecolors='face', cmap=my_cm, s=50, vmin=0, vmax=7)
resulting in the following figure.
"If my data does not cover each value of the boundary interval, the colorbar does not show all colours (as in the added figure)."
If either vmin or vmax is None, the color limits are set via the method autoscale_None, and the minimum and maximum of your data are therefore used. So with your code it is actually not necessary that every value of the boundary interval is covered in order to show all colors in the colorbar; only the minimum and maximum need to be included.
Using e.g. data = [0,0,0,0,0,0,0,7] results in the following:
When looking for something else, I found another solution to that problem: colorbar-for-matplotlib-plot-surface-command.
In that case, I do not need to set vmin and vmax, and it also works when the arrays/lists of points to plot are empty. Instead, a ScalarMappable is defined and passed to colorbar in place of the scatter instance.
import matplotlib.pyplot as plt
from matplotlib.colors import LinearSegmentedColormap, ListedColormap, BoundaryNorm
import matplotlib.cm as cm
from numpy import arange
fig, ax = plt.subplots()
my_cm = ListedColormap(['#a71b1b','#94258f','#ea99e6','#ec9510','#ece43b','#a3f8ff','#2586df','#035e0d'])
bounds=range(8)
norm = BoundaryNorm(bounds, my_cm.N)
mappable = cm.ScalarMappable(cmap=my_cm)
mappable.set_array(bounds)
data = [] # also x and y can be []
ax.scatter(x=range(my_cm.N), y=[1]*my_cm.N, c=data, edgecolors='face', cmap=my_cm, s=50)
cbar = fig.colorbar(mappable, ax=ax, boundaries=arange(-0.5,8,1), ticks=bounds, norm=norm)
cbar.ax.tick_params(axis='both', which='both',length=0)

How to change the number of digits of the mantissa using offset notation in a matplotlib colorbar

I have a contour plot in matplotlib using a colorbar which is created by
from mpl_toolkits.axes_grid1 import make_axes_locatable
divider = make_axes_locatable(axe) #adjust colorbar to fig height
cax = divider.append_axes("right", size=size, pad=pad)
cbar = f.colorbar(cf,cax=cax)
cbar.ax.yaxis.set_offset_position('left')
cbar.ax.tick_params(labelsize=17)#28
t = cbar.ax.yaxis.get_offset_text()
t.set_size(15)
How can I make the colorbar tick labels (the mantissa of the exponent notation) show only 2 digits after the '.' instead of 3 (keeping the offset notation)? Is this possible, or do I have to set the ticks manually? Thanks
So far I have tried the string formatter
cbar.ax.yaxis.set_major_formatter(FormatStrFormatter('%.2g'))
but this doesn't give me the desired result.
The problem is that while the FormatStrFormatter allows you to set the format precisely, it is not capable of handling offsets like the 1e-7 in the case from the question.
On the other hand, the default ScalarFormatter automatically selects its own format without letting the user change it. While this is mostly desirable, in this case we want to specify the format ourselves.
A solution is to subclass the ScalarFormatter and reimplement its ._set_format() method, similar to this answer.
Note that you would want "%.2f" instead of "%.2g" to always show 2 digits after the decimal point.
import numpy as np; np.random.seed(0)
import matplotlib.pyplot as plt
import matplotlib.ticker
class FormatScalarFormatter(matplotlib.ticker.ScalarFormatter):
    def __init__(self, fformat="%1.1f", offset=True, mathText=True):
        self.fformat = fformat
        matplotlib.ticker.ScalarFormatter.__init__(self, useOffset=offset,
                                                   useMathText=mathText)
    def _set_format(self, vmin, vmax):
        self.format = self.fformat
        if self._useMathText:
            self.format = '$%s$' % matplotlib.ticker._mathdefault(self.format)
z = (np.random.random((10,10))*0.35+0.735)*1.e-7
fig, ax = plt.subplots()
plot = ax.contourf(z, levels=np.linspace(0.735e-7,1.145e-7,10))
fmt = FormatScalarFormatter("%.2f")
cbar = fig.colorbar(plot,format=fmt)
plt.show()
Sorry for getting into the loop so late. If you are still looking for a solution, an easier way is as follows.
import matplotlib.ticker as tick
cbar.ax.yaxis.set_major_formatter(tick.FormatStrFormatter('%.2f'))
Note: it's '%.2f' instead of '%.2g'.

Fitting The Theoretical Equation To My Data

I am very, very new to Python, so please bear with me and pardon my naivety. I am using Spyder with Python 2.7 on my Windows laptop. As the title suggests, I have some data and a theoretical equation, and I am attempting to fit my data with what I believe is a chi-squared fit. The theoretical equation I am using is
y(t) = (m / (gamma*D^2)) * ln(cosh(sqrt(gamma/(m*g)) * D*g*t))
import math
import numpy as np
import scipy.optimize as optimize
import matplotlib.pylab as plt
import csv
#with open('1.csv', 'r') as datafile:
# datareader = csv.reader(datafile)
# for row in datareader:
# print ', '.join(row)
t_y_data = np.loadtxt('exerciseball.csv', dtype=float, delimiter=',', usecols=(1,4), skiprows = 1)
print(t_y_data)
t = t_y_data[:,0]
y = t_y_data[:,1]
gamma0 = [.1]
sigma = [(0.345366)/2]*(len(t))
#len(sigma)
#print(sigma)
#print(len(sigma))
#sigma is the error in our measurements, which is the radius of the object
# Dragfunction is the theoretical equation of the position as a function of time when the thing falling experiences a drag force
# This is the function we are trying to fit to our data
# t is the independent variable time, m is the mass, and D is the Diameter
# Gamma is the value which Python will vary until chi-squared is a minimum
def Dragfunction(x, gamma):
    print x
    g = 9.8
    D = 0.345366
    m = 0.715
    # num = math.sqrt(gamma)*D*g*x
    # den = math.sqrt(m*g)
    # frac = num/den
    # print "frac", frac
    return ((m)/(gamma*D**2))*math.log(math.cosh(math.sqrt(gamma/m*g)*D*g*t))
optimize.curve_fit(Dragfunction, t, y, gamma0, sigma)
This is the error message I am getting:
return ((m)/(gamma*D**2))*math.log(math.cosh(math.sqrt(gamma/m*g)*D*g*t))
TypeError: only length-1 arrays can be converted to Python scalars
My professor and I have spent about three or four hours trying to fix this. He helped me work out a lot of the problems, but this one we can't seem to resolve.
Could someone please help? If there is any other information you need, please let me know.
Your error message comes from the fact that those math functions only accept a scalar, so to call functions on an array, use the numpy versions:
In [82]: a = np.array([1,2,3])
In [83]: np.sqrt(a)
Out[83]: array([ 1. , 1.41421356, 1.73205081])
In [84]: math.sqrt(a)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
----> 1 math.sqrt(a)
TypeError: only length-1 arrays can be converted to Python scalars
In the process, I happened to spot a mathematical error in your code. Your equation at the top says that g belongs in the bottom of the square root inside the log(cosh()), but you've got it on top, because a/b*c == a*c/b in Python, not a/(b*c). So
log(cosh(sqrt(gamma/m*g)*D*g*t))
should instead be any one of these:
log(cosh(sqrt(gamma/m/g)*D*g*t))
log(cosh(sqrt(gamma/(m*g))*D*g*t))
log(cosh(sqrt(gamma*g/m)*D*t)) # the simplest, by canceling with the g from outside sqrt
A second error is that in your function definition you have a parameter named x which you never use; instead you use t, which at this point is a global variable (your data), so you won't see an error. You won't notice the effect with curve_fit, since it passes your t data to the function anyway, but if you tried to call Dragfunction on a different data set, it would still compute the result from the global t values. You probably meant this:
def Dragfunction(t, gamma):
    print t
    ...
    return ... D*g*t ...
A couple other notes as unsolicited advice, since you said you were new to python:
You can load and "unpack" the t and y variables at once with:
t, y = np.loadtxt('exerciseball.csv', dtype=float, delimiter=',', usecols=(1,4), skiprows = 1, unpack=True)
If your error is constant, then sigma has no effect on curve_fit, as it only affects the relative weighting for the fit, so you really don't need it at all.
Below is my version of your code, with all of the above changes in place.
import numpy as np
from scipy import optimize # simplified syntax
import matplotlib.pyplot as plt # pylab != pyplot
# `unpack` lets you split the columns immediately:
t, y = np.loadtxt('exerciseball.csv', dtype=float, delimiter=',',
usecols=(1, 4), skiprows=1, unpack=True)
gamma0 = .1 # does not need to be a list
def Dragfunction(t, gamma):
    g = 9.8
    D = 0.345366
    m = 0.715
    gammaD_m = gamma*D*D/m  # this combination is used twice; calculate it once for a (small) speedup
    return np.log(np.cosh(np.sqrt(gammaD_m*g)*t)) / gammaD_m
gamma_best, gamma_var = optimize.curve_fit(Dragfunction, t, y, gamma0)
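As a follow-up (a sketch of my own, not part of the answer above), you can eyeball the quality of the fit by plotting the fitted curve over the data; gamma_best holds the fitted parameter array returned by curve_fit:
t_fine = np.linspace(t.min(), t.max(), 200)  # dense grid for a smooth curve
plt.plot(t, y, '.', label='data')
plt.plot(t_fine, Dragfunction(t_fine, gamma_best[0]), label='fit')
plt.legend()
plt.show()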