I am on Python 2.7 with the Spyder IDE, and this is my data:
Duration ptno
7432.0 X35133502100
7432.0 X35133502100
35255.0 T7956000304
35255.0 T7956000304
17502.0 T7956000304
17502.0 T7956000304
46.0 T7956000304
46.0 T7956000304
The code:
import time
import pandas as pd
import matplotlib.pyplot as plt
df1 = pd.read_csv('Nissin_11.09.2018.csv')
bx = df1.plot.bar(x='ptno', y='d', rot=0)
plt.setp(bx.get_xticklabels(),rotation=30,horizontalalignment='right')
plt.show()
I get a nice bar plot as I wanted for each value mentioned in columns Duration & ptno. For reference I am attaching image file of the plot.
But when I try to get a scatter plot with:
df1.plot.scatter(x='ptno', y='d')
It throws this error:
ValueError: scatter requires x column to be numeric
How can I get a scatter plot for my data?
As suggested by @Hristo Iliev, I used his code:
import seaborn as sns
_ = sns.stripplot(x='ptno', y='d', data=df1)
But it only plots the two unique values on the x axis, whereas I would like to have all values on the x axis, as in my bar plot.
One option is to use pure matplotlib. You need to create an array of numbers to use as the x axis, i.e. [1,2,3,4,5,...] and then change the tick labels to the value of the column ptno.
For example:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df1 = pd.DataFrame({"Duration":[7432,7432,35255,35255,17502,17502,46,46],
"ptno":["X35", "X35", "T79", "T79", "T79", "T79", "T79", "T79"]})
dummy_x = np.arange(len(df1.ptno))
plt.scatter(dummy_x, df1.Duration)
plt.xticks(dummy_x, df1.ptno)
plt.show()
You cannot make scatter plots with non-numeric values as indicated by the error. In a scatter plot, the position of each point is determined by the location on the real axis of the value of each variable. Categorical or string values such as T7956000304 have no direct mapping to a position on the real axis.
What you can plot though is a series of strip plots, one for each unique value of ptno. That's easiest to do with Seaborn:
import seaborn as sns
_ = sns.stripplot(x='ptno', y='d', data=df1)
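If one x position per category is acceptable, another option is to map each category to an integer code and label the ticks yourself. This is only a minimal sketch: it assumes the small sample frame below (shortened, made-up part numbers, as in the earlier answer) rather than the original CSV, and it relies on pd.factorize assigning codes in order of first appearance.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Sample data standing in for the CSV (assumption: shortened part numbers)
df1 = pd.DataFrame({
    "Duration": [7432, 7432, 35255, 35255, 17502, 17502, 46, 46],
    "ptno": ["X35", "X35", "T79", "T79", "T79", "T79", "T79", "T79"],
})

# Map each distinct ptno to an integer code: 0 for the first seen, 1 for the next, ...
codes, uniques = pd.factorize(df1["ptno"])

fig, ax = plt.subplots()
ax.scatter(codes, df1["Duration"])      # numeric x positions, so scatter accepts them
ax.set_xticks(range(len(uniques)))      # one tick per category
ax.set_xticklabels(list(uniques))       # show the original strings as labels
```

All rows with the same ptno share one x position here, which is the same layout the strip plot produces.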
Passing a 2D array to Matplotlib's histogram function with histtype='step' seems to plot the columns in reverse order (at least from my biased, Western perspective of left-to-right).
Here's an illustration:
import matplotlib.pyplot as plt
import numpy as np
X = np.array([
np.random.normal(size=5000),
np.random.uniform(size=5000)*2.0 - 1.0,
np.random.beta(2.0,1.0,size=5000)*3.0,
]).T
trash = plt.hist(X,bins=50,histtype='step')
plt.legend(['Normal','2*Uniform-1','3*Beta(2,1)'],loc='upper left')
Produces this:
Running matplotlib version 2.0.2, python 2.7
From the documentation for legend:
in order to keep the "label" and the legend element instance together, it is preferable to specify the label either at artist creation, or by calling the set_label method on the artist
I recommend using the label keyword argument to hist:
String, or sequence of strings to match multiple datasets
The result is:
import matplotlib.pyplot as plt
import numpy as np
X = np.array([
np.random.normal(size=5000),
np.random.uniform(size=5000)*2.0 - 1.0,
np.random.beta(2.0,1.0,size=5000)*3.0,
]).T
trash = plt.hist(X,bins=50,histtype='step',
label=['Normal','2*Uniform-1','3*Beta(2,1)'])
plt.legend(loc='upper left')
plt.show()
I am using matplotlib to create the plots. I have to identify each plot with a different color which should be automatically generated by Python.
Can you please give me a method to put different colors for different plots in the same figure?
Matplotlib does this by default.
E.g.:
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(10)
plt.plot(x, x)
plt.plot(x, 2 * x)
plt.plot(x, 3 * x)
plt.plot(x, 4 * x)
plt.show()
And, as you may already know, you can easily add a legend:
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(10)
plt.plot(x, x)
plt.plot(x, 2 * x)
plt.plot(x, 3 * x)
plt.plot(x, 4 * x)
plt.legend(['y = x', 'y = 2x', 'y = 3x', 'y = 4x'], loc='upper left')
plt.show()
If you want to control the colors that will be cycled through:
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(10)
plt.gca().set_prop_cycle(color=['red', 'green', 'blue', 'yellow'])  # set_color_cycle is deprecated
plt.plot(x, x)
plt.plot(x, 2 * x)
plt.plot(x, 3 * x)
plt.plot(x, 4 * x)
plt.legend(['y = x', 'y = 2x', 'y = 3x', 'y = 4x'], loc='upper left')
plt.show()
If you're unfamiliar with matplotlib, the tutorial is a good place to start.
Edit:
First off, if you have a lot (>5) of things you want to plot on one figure, either:
Put them on different plots (consider using a few subplots on one figure), or
Use something other than color (i.e. marker styles or line thickness) to distinguish between them.
Otherwise, you're going to wind up with a very messy plot! Be nice to whoever is going to read whatever you're doing and don't try to cram 15 different things onto one figure!
Beyond that, many people are colorblind to varying degrees, and distinguishing between numerous subtly different colors is difficult for more people than you may realize.
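As a rough illustration of the second suggestion, the property cycle can vary line style and marker instead of color. This is a sketch under my own assumptions: the particular styles, the six example lines and the black color are arbitrary choices, not something from the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from cycler import cycler

# 3 line styles x 2 markers = 6 distinct combinations
style_cycle = cycler('linestyle', ['-', '--', ':']) * cycler('marker', ['o', 's'])

fig, ax = plt.subplots()
ax.set_prop_cycle(style_cycle)

x = np.arange(10)
lines = []
for k in range(1, 7):
    # Every line is black; only the line style and marker tell them apart
    line, = ax.plot(x, k * x, color='black', label='y = %dx' % k)
    lines.append(line)
ax.legend(loc='upper left')

styles = [(line.get_linestyle(), line.get_marker()) for line in lines]
```

Because the lines differ in shape rather than hue, the figure also stays readable in grayscale print.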
That having been said, if you really want to put 20 lines on one axis with 20 relatively distinct colors, here's one way to do it:
import matplotlib.pyplot as plt
import numpy as np
num_plots = 20
# Have a look at the colormaps here and decide which one you'd like:
# http://matplotlib.org/1.2.1/examples/pylab_examples/show_colormaps.html
colormap = plt.cm.gist_ncar
# Use the colormap chosen above; stop at 0.9 because gist_ncar ends in near-white
plt.gca().set_prop_cycle(plt.cycler('color', colormap(np.linspace(0, 0.9, num_plots))))
# Plot several different functions...
x = np.arange(10)
labels = []
for i in range(1, num_plots + 1):
    plt.plot(x, i * x + 5 * i)
    labels.append(r'$y = %ix + %i$' % (i, 5*i))
# I'm basically just demonstrating several different legend options here...
plt.legend(labels, ncol=4, loc='upper center',
bbox_to_anchor=[0.5, 1.1],
columnspacing=1.0, labelspacing=0.0,
handletextpad=0.0, handlelength=1.5,
fancybox=True, shadow=True)
plt.show()
Setting them later
If you don't know in advance how many lines you are going to plot, you can change the colours after plotting, retrieving the number of lines directly from the axes using .lines. I use this solution:
Some random data
import matplotlib.pyplot as plt
import numpy as np
fig1 = plt.figure()
ax1 = fig1.add_subplot(111)
for i in range(1,15):
    ax1.plot(np.array([1,5])*i,label=i)
The piece of code that you need:
colormap = plt.cm.gist_ncar #nipy_spectral, Set1,Paired
colors = [colormap(i) for i in np.linspace(0, 1,len(ax1.lines))]
for i,j in enumerate(ax1.lines):
    j.set_color(colors[i])
ax1.legend(loc=2)
The result is the following:
TL;DR No, it can't be done automatically. Yes, it is possible.
import matplotlib.pyplot as plt
my_colors = plt.rcParams['axes.prop_cycle']() # <<< note that we CALL the prop_cycle
fig, axes = plt.subplots(2,3)
for ax in axes.flatten(): ax.plot((0,1), (0,1), **next(my_colors))
Each plot (axes) in a figure (figure) has its own cycle of colors — if you don't force a different color for each plot, all the plots share the same order of colors but, if we stretch a bit what "automatically" means, it can be done.
The OP wrote
[...] I have to identify each plot with a different color which should be automatically generated by [Matplotlib].
But... Matplotlib automatically generates different colors for each different curve
In [10]: import numpy as np
...: import matplotlib.pyplot as plt
In [11]: plt.plot((0,1), (0,1), (1,2), (1,0));
Out[11]:
So why the OP request? If we continue to read, we have
Can you please give me a method to put different colors for different plots in the same figure?
and it makes sense, because each plot (each axes in Matplotlib's parlance) has its own color_cycle (or rather, in 2018, its prop_cycle) and each plot (axes) reuses the same colors in the same order.
In [12]: fig, axes = plt.subplots(2,3)
In [13]: for ax in axes.flatten():
    ...:     ax.plot((0,1), (0,1))
If this is the meaning of the original question, one possibility is to explicitly name a different color for each plot.
If the plots (as it often happens) are generated in a loop we must have an additional loop variable to override the color automatically chosen by Matplotlib.
In [14]: fig, axes = plt.subplots(2,3)
In [15]: for ax, short_color_name in zip(axes.flatten(), 'brgkyc'):
    ...:     ax.plot((0,1), (0,1), short_color_name)
Another possibility is to instantiate a cycler object
from cycler import cycler
my_cycler = cycler('color', ['k', 'r']) * cycler('linewidth', [1., 1.5, 2.])
actual_cycler = my_cycler()
fig, axes = plt.subplots(2,3)
for ax in axes.flat:
    ax.plot((0,1), (0,1), **next(actual_cycler))
Note that type(my_cycler) is cycler.Cycler but type(actual_cycler) is itertools.cycle.
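The difference between the two objects can be checked without drawing anything. This is a small sketch that uses only the cycler package (installed together with Matplotlib); the variable names follow the snippet above, and first_style is a name I introduce for illustration.

```python
import itertools
from cycler import Cycler, cycler

# 2 colors x 3 line widths = 6 style combinations
my_cycler = cycler('color', ['k', 'r']) * cycler('linewidth', [1., 1.5, 2.])

# Calling the Cycler returns an endless iterator over those combinations
actual_cycler = my_cycler()

# Each item is a dict that can be unpacked into plot() with **
first_style = next(actual_cycler)

print(isinstance(my_cycler, Cycler))               # the finite, reusable description
print(isinstance(actual_cycler, itertools.cycle))  # the endless iterator
print(first_style)
```

Because actual_cycler never runs out, the same iterator can be shared across any number of axes, and the styles simply repeat after the sixth one.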
I would like to offer a minor improvement on the last loop example given in the previous answer (that answer is correct and should still be accepted). The implicit assumption made when labeling the last example is that plt.legend(LIST) assigns label number X in LIST to the line created by the Xth call to plot. I have run into problems with this approach before. The way recommended by Matplotlib's documentation ( http://matplotlib.org/users/legend_guide.html#adjusting-the-order-of-legend-item) is to pass the line handles together with the labels, so that you can be sure the labels go along with the exact plots you think they do:
...
# Plot several different functions...
labels = []
plotHandles = []
x = np.arange(10)
for i in range(1, num_plots + 1):
    line, = plt.plot(x, i * x + 5 * i) #need the ',' per ** below
    plotHandles.append(line)
    labels.append(r'$y = %ix + %i$' % (i, 5*i))
plt.legend(plotHandles, labels, loc='upper left',ncol=1)
**: Matplotlib Legends not working
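A related way to keep each label attached to its line is to pass label= when the line is created and let the axes hand back matching handles and labels. This is a sketch with four made-up lines, not the code from the post above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
x = np.arange(10)
for i in range(1, 5):
    # The label is stored on the line itself, so it cannot drift out of order
    ax.plot(x, i * x, label='y = %dx' % i)

# Handles and labels come back paired, in plotting order
handles, labels = ax.get_legend_handles_labels()
ax.legend(handles, labels, loc='upper left', ncol=1)
```

With this approach the two lists never have to be maintained by hand; calling ax.legend(loc='upper left') alone would pick up the same pairs.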
Matplotlib colors your plots with different colors by default, but in case you want to set specific colors:
import matplotlib.pyplot as plt
import numpy as np
x = np.arange(10)
plt.plot(x, x)
plt.plot(x, 2 * x,color='blue')
plt.plot(x, 3 * x,color='red')
plt.plot(x, 4 * x,color='green')
plt.show()
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
import numpy as np
from skspatial.objects import Line, Vector
for count in range(0,len(LineList),1):
    Line_Color = np.random.rand(3,)
    # pass the random RGB triple as the colour; ax is assumed to be an existing 3D axes
    Line(StartPoint,EndPoint).plot_3d(ax,c=Line_Color,label="Line"+str(count))
plt.legend(loc='lower left')
plt.show(block=True)
The above code might help you add 3D lines with randomly chosen colours. The coloured lines can also be identified with the help of a legend, using the text passed in the label="... " parameter.
Honestly, my favourite way to do this is pretty simple. It won't work for an arbitrarily large number of plots, but it will do you up to 1163. It uses the map of all of matplotlib's named colours and selects from them at random.
from random import choice
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors
# Get full named colour map from matplotlib
colours = mcolors._colors_full_map # This is a dictionary of all named colours
# Turn the dictionary into a list
color_lst = list(colours.values())
# Plot using these random colours
for n, plot in enumerate(plots):
    plt.scatter(plot[x], plot[y], color=choice(color_lst), label=n)
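One caveat with choice is that it can return the same colour twice. A possible variation, sketched here under my own assumptions (the public CSS4_COLORS table instead of the private full map, and ten plots), draws without replacement so that every plot gets a different colour.

```python
import random
import matplotlib.colors as mcolors

# CSS4_COLORS maps names to hex strings; some names share a hex value
# (for example 'gray' and 'grey'), so deduplicate the values first
unique_colours = sorted(set(mcolors.CSS4_COLORS.values()))

num_plots = 10  # hypothetical number of plots
# random.sample draws without replacement, so no colour repeats
chosen = random.sample(unique_colours, num_plots)
```

Each entry of chosen can then be passed as color= to one plt.scatter or plt.plot call. Random picks can still land on colours that look alike, so this only guarantees that no colour is reused exactly.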
Hello, I'm trying to change the tick mark increments of the following plot to numbers that are more appropriate, i.e. increments of 1 on the x-axis and 10 on the y-axis.
Plot I'm trying to fix
The code I've tried is below:
Any help would be much appreciated!!!
import netCDF4
f = netCDF4.Dataset('AVSA.nc','r')
#plot Daily average
import matplotlib.pyplot as plt
v= f.variables['emissions'][0,0:24,0]
plt.plot(v, linestyle='-',linewidth=5.0, c='c')
plt.xlabel('Hour')
plt.ylabel('Emissions')
plt.title (' Emissions 3')
plt.ylim(0, 180)
plt.xlim(0,23)
plt.show()
You can write and position the ticks directly using:
plt.xticks(range(25),[str(i) for i in range(25)])
plt.yticks(range(0,180,10),[str(i) for i in range(0,180,10)])
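If you would rather state the spacing than list every tick, matplotlib.ticker.MultipleLocator is another option. This is a sketch with placeholder data in place of the netCDF variable, which I do not have.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator

v = list(range(24))  # placeholder for the 24 hourly values

fig, ax = plt.subplots()
ax.plot(v, linestyle='-', linewidth=5.0, c='c')
ax.set_xlim(0, 23)
ax.set_ylim(0, 180)

# Put a major tick at every multiple of 1 on x and of 10 on y
ax.xaxis.set_major_locator(MultipleLocator(1))
ax.yaxis.set_major_locator(MultipleLocator(10))

xticks = ax.get_xticks()
yticks = ax.get_yticks()
```

The locator recomputes the ticks whenever the axis limits change, so the spacing stays at 1 and 10 even if xlim or ylim are adjusted later.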
In your code (I generated some data since I don't have your file) you would have:
import matplotlib.pyplot as plt
v= range(25)#f.variables['store_Bio'][0,0:24,0]
plt.plot(v, linestyle='-',linewidth=5.0, c='c')
plt.xticks(range(25),[str(i) for i in range(25)])
plt.yticks(range(0,180,10),[str(i) for i in range(0,180,10)])
plt.xlabel('Hour')
plt.ylabel('Average Biogenic Emissions')
plt.title ('Daily average Biogenic Emissions March 2013')
plt.ylim(0, 180)
plt.xlim(0,23)
plt.show()
, which results in:
Hello everyone,
I need some help. I wrote this script:
import matplotlib.pyplot as plt
import scipy
import pyfits
import numpy as np
import re
import os
import glob
import time
global numbers
numbers=re.compile(r'(\d+)')
def numericalSort(value):
    parts = numbers.split(value)
    parts[1::2] = map(int, parts[1::2])
    return parts
image_list=sorted(glob.glob('*.fit'), key=numericalSort)
for i in range(len(image_list)):
    hdulist=pyfits.open(image_list[i])
    data=hdulist[0].data
    dimension=hdulist[0].header['NAXIS1']
    time=hdulist[0].header['TIME']
    hours=float(time[:2])*3600
    minutes=float(time[3:5])*60
    sec=float(time[6:])
    cas=hours+minutes+sec
    y=[]
    for n in range(0,dimension):
        y.append(data.flat[n])
    maxy= max(y)
    print image_list[i],cas,maxy
    plt.plot([cas],[maxy],'bo')
    plt.ion()
    plt.draw()
This script reads FITS data files. For each file it finds the maximum value, which is the y value, and reads TIME from the header, which is the x value.
And now my problem: when I run this script I get a graph, but only with points. How do I get a graph with a line (joining point to point)?
Thanks for any answers and help.
Your problem may well be here:
plt.plot([cas],[maxy],'bo')
at the point that this statement is encountered, cas is a single value and maxy is also a single value -- you have only one point to plot and therefore nothing to join. Next time round the loop you plot another single point, unconnected to the previous one, and so on.
I can't be sure, but perhaps you mean to do something like:
x = []
max_values = []
for i in range(len(image_list)):
    hdulist=pyfits.open(image_list[i])
    data=hdulist[0].data
    dimension=hdulist[0].header['NAXIS1']
    time=hdulist[0].header['TIME']
    hours=float(time[:2])*3600
    minutes=float(time[3:5])*60
    sec=float(time[6:])
    cas=hours+minutes+sec
    x.append(cas)
    y=[]
    for n in range(0,dimension):
        y.append(data.flat[n])
    maxy= max(y)
    # keep one maximum per file so it pairs with the time stored in x
    max_values.append(maxy)
    print image_list[i],cas,maxy
# plot once, after the loop, so all points belong to a single line
plt.plot(x, max_values ,'bo-')
plt.ion()
plt.draw()
i.e. plot a single line once you've collected all the x and y values. The format string bo- is what provides the connecting line.
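The same idea can be tried without any FITS files. This is a sketch with made-up (TIME, maximum) pairs; the helper time_to_seconds is my own name and simply slices the string the same way the script does, assuming a value of the form HH:MM:SS.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

def time_to_seconds(time_string):
    """Convert 'HH:MM:SS' text to seconds, slicing like the original script."""
    hours = float(time_string[:2]) * 3600
    minutes = float(time_string[3:5]) * 60
    seconds = float(time_string[6:])
    return hours + minutes + seconds

# Hypothetical (TIME header, maximum pixel value) pairs standing in for the files
samples = [("20:15:00", 1200), ("20:15:30", 1350),
           ("20:16:00", 1280), ("20:16:30", 1410)]

x, y = [], []
for time_string, max_value in samples:
    x.append(time_to_seconds(time_string))
    y.append(max_value)

fig, ax = plt.subplots()
# One plot call with all points: markers ('o') joined by a solid line ('-')
ax.plot(x, y, 'bo-')
```

The axes end up holding a single line object with four points, whereas calling plot inside the loop would create four separate one-point lines with nothing to join.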
OK, here is the solution:
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import scipy
import pyfits
import numpy as np
import re
import os
import glob
import time
global numbers
numbers=re.compile(r'(\d+)')
def numericalSort(value):
    parts = numbers.split(value)
    parts[1::2] = map(int, parts[1::2])
    return parts
fig=plt.figure()
ax1=fig.add_subplot(1,1,1)
def animate(i):
    image_list=sorted(glob.glob('*.fit'), key=numericalSort)
    cas,maxy=[],[]
    files=open("data.dat","w")  # "wr" is not a valid mode; write mode is enough here
    for n in range(len(image_list)):
        hdulist=pyfits.open(image_list[n])
        data=hdulist[0].data
        maxy=data.max()
        time=hdulist[0].header['TIME']
        hours=int(float(time[:2])*3600)
        minutes=int(float(time[3:5])*60)
        sec=int(float(time[6:]))
        cas=hours+minutes+sec
        files.write("\n{},{}".format(cas,maxy))
    files.close()
    pool=open('data.dat','r')
    data=pool.read()
    dataA=data.split('\n')
    xar=[]
    yar=[]
    pool.close()
    for line in dataA:
        if len(line)>1:
            x,y=line.split(',')
            xar.append(int(x))
            yar.append(int(y))
    print xar,yar
    ax1.clear()
    ax1.plot(xar,yar,'b-')
    ax1.plot(xar,yar,'ro')
    plt.title('Light curve')
    plt.xlabel('TIME')
    plt.ylabel('Max intensity')
    plt.grid()
# Assumption: the post does not show how animate is scheduled; FuncAnimation is the usual way
ani=animation.FuncAnimation(fig, animate)
plt.show()
This script reads some values from files and plots them.
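The numericalSort key is what makes the files come out in numeric rather than alphabetical order. This is a small sketch with made-up file names; it uses a list comprehension instead of map and a snake_case name, which are my own changes.

```python
import re

numbers = re.compile(r'(\d+)')

def numerical_sort(value):
    # Split into text and digit runs, e.g. 'img10.fit' -> ['img', '10', '.fit']
    parts = numbers.split(value)
    # Turn the digit runs (every second item) into integers so 2 sorts before 10
    parts[1::2] = [int(p) for p in parts[1::2]]
    return parts

names = ['img10.fit', 'img2.fit', 'img1.fit']  # hypothetical file names
ordered = sorted(names, key=numerical_sort)
plain = sorted(names)

print(ordered)
print(plain)
```

Plain string sorting places img10.fit before img2.fit, which would put the frames out of time order in the light curve.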