Bokeh tool tip displaying all the values for the line - python-2.7

I am trying to plot a line chart that includes a tooltip, but the code below displays all the values of the line in the tooltip instead of a single value for the coordinates under the mouse.
#Import the library
import pandas
import itertools
import bokeh
import MySQLdb
from bokeh.plotting import figure, output_file, show
from bokeh.models import HoverTool
TOOLS='hover'
wells=['F1','F2','F3','F4','F5','F6','F7','F8','F9','F10','F11','F12','G1','G2','G3','G4','G5','G6','G7','G8','G9','G10','G11','G12']
p = figure(plot_width=800, plot_height=640,x_axis_type="datetime", tools=TOOLS)
p.title.text = 'Click on legend entries to hide the corresponding lines'
# Open database connection
db = MySQLdb.connect("localhost","user","password","db" )
#pallete for the lines
my_palette=bokeh.palettes.inferno(len(wells))
#create a statement to get the data
for name, color in zip(wells,my_palette):
    stmnt='select date_time,col1,wells,test_value from db where wells="%s"'%(name)
    #creating dataframe
    df=pandas.read_sql(stmnt,con=db)
    p.scatter(df['date_time'], df['test_value'], line_width=2, color=color, alpha=0.8, legend=name,)
    #Inserting tool tip
    hover = p.select(dict(type=HoverTool))
    hover.tooltips = [("Wells","@wells"),("Date","@%s"%(df['date_time'])),("Values","@%s"%(df['test_value']))]
    hover.mode = 'mouse'
#Adding a legend
p.legend.location = "top_right"
output_file("interactive_legend.html", title="interactive_legend.py example")
show(p)
Given below is the resultant screenshot
I am trying to get only one well,Date_time,Test_value at given mouse over instance

This code:
hover.tooltips = [
    ("Wells","@wells"),
    ("Date","@%s"%(df['date_time'])),
    ("Values","@%s"%(df['test_value']))
]
Does not do what you think. Let's suppose df['date_time'] has the value [10, 20, 30, 40]. Then after your string substitution, your tooltip looks like:
("Date", "@[10, 20, 30, 40]")
That explains exactly what you are seeing. The @[10 part looks for a column named "[10" in your ColumnDataSource (because of the @ in front). There is no column with that name, so the tooltip prints ??? to indicate that it can't find data to look up. The rest, 20, 30, 40], is just plain text, so it is printed as-is. In your code you are actually passing a pandas Series rather than a list, so the string substitution also puts the Name and dtype info into the tooltip text.
Since you are passing the data sequences directly to scatter, it creates a ColumnDataSource for you, and the default column names in that CDS are 'x' and 'y'. My best guess is that you actually want:
hover.tooltips = [
    ("Wells","@wells"),
    ("Date","@x"),
    ("Values","@y")
]
But note that you would want to do this outside the loop. As it is you are simply modifying the same hover tool over and over.
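The substitution problem can be reproduced without Bokeh at all. The sketch below uses a plain list as a stand-in for df['date_time'] (an assumption made only to keep the example short) and contrasts the string the original code builds with the fixed field names that a hover tool resolves per point:

```python
# Plain-Python illustration; no Bokeh or database needed.
date_time = [10, 20, 30, 40]  # hypothetical stand-in for df['date_time']

# What the original code builds: the whole sequence is formatted into the string.
broken = ("Date", "@%s" % (date_time,))
print(broken)  # ('Date', '@[10, 20, 30, 40]')

# What the hover tool needs: fixed field names, defined once outside the loop.
tooltips = [("Date", "@x"), ("Values", "@y")]
print(tooltips)
```

The tooltip strings never contain data themselves. They only name the column whose value should be looked up for the point under the mouse.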

Related

Python - AttributeError: 'DataFrame' object has no attribute

I have a CSV file with various columns, and everything worked perfectly for the past few months. After I updated the file with new information, one column no longer appears to be picked up by Python. I am using Python 2.7 and have made sure I have the latest version of pandas.
When I downloaded the csv file from Yahoo Finance, I opened it in Excel and made changes to the format of the columns in order to make it more readable as all information was in one cell. I used the "Text to Column" feature and split up the data based on where the commas were.
Then I made sure that in each column there were no white spaces in the beginning of the cell using the Trim function in excel and left-aligning the data.
I tried the following and still got the same or a similar error:
After df = pd.read_csv("KIO.csv") I checked whether I could read the first few rows using df.head(), but still got the same error.
I tried renaming the problematic column, as suggested in a similar post, using:
df = df.rename(columns={"Close": "Closing"}) - here I got the same error again. "print df.columns" also led to the same issue.
"df[1]" - gave a long error with "KeyError: 1" at the end - I can post the entire thing if it will assist.
Adding "skipinitialspace=True" - no difference.
I thought the problem might be in the actual CSV file contents, so I deleted all the columns and entered my own data, and I still got the same error.
Below is a portion of my code as the total code is very long:
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as pltdate
import datetime
import matplotlib.animation as animation
import numpy as np
df = pd.read_csv("KIO.csv", skipinitialspace=True)
#df.head()
#Close = df.columns[0]
#df= df.rename(columns={"Close": "Closing"})
df1 = pd.read_csv("USD-ZAR.csv")
kio_close = pd.DataFrame(df.Close)
exchange = pd.DataFrame(df1.Value)
dates = df["Date"]
dates1 = df1["Date"]
The above variables have been used throughout the remaining code though so if this issue can be solved here the remaining code will be right.
This is copy/paste of the error:
Traceback (most recent call last):
  File "C:/Users/User/Documents/PycharmProjects/Trading_GUI/GUI_testing.py", line 33, in <module>
    kio_close = pd.DataFrame(df.Close)
  File "C:\Python27\lib\site-packages\pandas\core\generic.py", line 4372, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'Close'
Thank you so much in advance.
@Rip_027 This is in regard to your last comment. I used to have the same issue whenever I opened a CSV file by simply double-clicking the file icon. You need to launch Excel first, then use Get External Data. The link below has more details and can serve as a guideline. Hope this helps.
https://www.hesa.ac.uk/support/user-guides/import-csv
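Before changing how the file is opened, it can help to check which column names a parser actually sees. The sketch below is only a diagnostic aid: the header_report helper and both sample strings are hypothetical, and the semicolon-delimited variant is an assumed example of how a re-saved file might differ, not something the post confirms.

```python
import csv
import io

def header_report(csv_text):
    """Return the header cells of CSV text exactly as a comma-based parser sees them."""
    reader = csv.reader(io.StringIO(csv_text))
    return next(reader)

# Hypothetical file contents, not taken from KIO.csv.
clean = "Date,Open,Close\n2018-01-02,10,11\n"
resaved = "Date;Open;Close\n2018-01-02;10;11\n"

print(header_report(clean))    # ['Date', 'Open', 'Close']
print(header_report(resaved))  # ['Date;Open;Close']
```

If the header comes back as a single cell, or with stray characters around a name, attribute access such as df.Close cannot find a column called exactly "Close".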

Listbox Python Columns

I am trying to develop a script that allows me to keep my formatting within my listbox.
from Tkinter import *
from tabulate import tabulate
master = Tk()
listbox = Listbox(master)
listbox.pack()
table = [["spam",42],["eggs",451],["bacon",0]]
headers = ["item", "qty"]
tb = tabulate(table, headers, tablefmt="plain")
listbox.insert(END,tb)
mainloop()
End result: the listbox populated with the tb formatting:
Question: how do I get my listbox to appear like the picture above, which I formatted with tabulate?
I've noticed Treeview seems to have some limitations with the horizontal box and with expanding the columns without adjusting the entire GUI, so I decided this might be a more makeshift way that will suit my needs just fine.
One option may be to use str.format() to align each insert into the listbox:
from Tkinter import *
import tkFont
master = Tk()
master.resizable(width=False, height=False)
master.geometry('{width}x{height}'.format(width=300, height=100))
my_font = tkFont.Font(family="Monaco", size=12) # use a fixed width font so columns align
listbox = Listbox(master, width=400, height=400, font=my_font)
listbox.pack()
table = [["spam", 42, "test", ""],["eggs", 451, "", "we"],["bacon", "True", "", ""]]
headers = ["item", "qty", "sd", "again"]
row_format ="{:<8}{sp}{:>8}{sp}{:<8}{sp}{:8}" # left or right align, with an arbitrary '8' column width
listbox.insert(0, row_format.format(*headers, sp=" "*2))
for items in table:
    listbox.insert(END, row_format.format(*items, sp=" "*2))
mainloop()
Which appears to match the output you got using tabulate:
Another option could be to use a grid layout.
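Returning to the str.format() idea, the fixed width of 8 can be replaced by widths computed from the data. This is a minimal sketch that only builds the aligned strings and does not open a Tk window. The format_rows helper and its gap parameter are hypothetical names, and the alignment only holds in the listbox with a fixed-width font.

```python
table = [["spam", 42], ["eggs", 451], ["bacon", 0]]
headers = ["item", "qty"]

def format_rows(headers, rows, gap=2):
    """Left-align every cell to the widest entry in its column."""
    cells = [[str(c) for c in row] for row in [headers] + rows]
    widths = [max(len(r[i]) for r in cells) for i in range(len(headers))]
    sep = " " * gap
    return [sep.join(c.ljust(w) for c, w in zip(row, widths)) for row in cells]

lines = format_rows(headers, table)
for line in lines:
    print(line)
```

Each element of lines could then be passed to listbox.insert(END, line), one row per listbox entry.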

Bokeh: dashboard not updating axis range based on widget value

I am trying to implement an interactive dashboard in Bokeh with a "play" function that loops through all value pairs for two indicators selected by widgets.
Screen cap of dashboard
While the loop works, the dashboard resets the axis values for each step of the loop. So what I need is to set axis values based on the widget.value selected. To this end, I have built a data frame "ranges" that has the name of the indicator as index and the min/max value for each indicator as columns.
The updates for controls work thusly (x_axis,etc. are the names of the widgets):
controls = [x_axis, y_axis, start_yr, end_yr, years]
for control in controls:
    control.on_change('value', lambda attr, old, new: update())
The update function is supposed to update the ranges upon change in the controls like this:
def update():
    p.x_range = Range1d(start = ranges.loc[x_axis.value,"Min"],
                        end = ranges.loc[x_axis.value,"Max"])
    p.y_range = Range1d(start = ranges.loc[y_axis.value,"Min"],
                        end = ranges.loc[y_axis.value,"Max"])
What should happen: Whenever I change the value of the widget, the ranges should update, but other than that, they should remain constant
What does happen: The ranges are set based on the value of the widget initially set and don't change on update.
I've tried to find examples trying to achieve something similar but no luck.
This is a working example:
import numpy as np
from bokeh.plotting import figure
from bokeh.models import Range1d
from bokeh.io import curdoc
x = np.linspace(0, 100, 1000)
y = np.sin(x)
p = figure(x_range=(0, 100))
p.circle(x, y)
def cb():
    # this works:
    p.x_range.start += 1
    p.x_range.end += 1
    # this also works:
    #p.x_range = Range1d(p.x_range.start+1, p.x_range.end+1)
curdoc().add_periodic_callback(cb, 200)
curdoc().add_root(p)
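The same pattern might be carried over to the question's update() roughly as follows. This is a sketch under assumptions: the contents of the ranges frame and the set_axis_ranges helper name are hypothetical, and the figure is created with explicit Range1d ranges, as the working example does with x_range=(0, 100).

```python
import pandas as pd
from bokeh.models import Range1d
from bokeh.plotting import figure

# Hypothetical stand-in for the question's "ranges" frame (indicator -> Min/Max).
ranges = pd.DataFrame(
    {"Min": [0.0, 20.0], "Max": [50000.0, 90.0]},
    index=["gdp", "life_expectancy"],
)

# Explicit ranges at creation time, so there is a Range1d to update later.
p = figure(x_range=Range1d(0, 1), y_range=Range1d(0, 1))

def set_axis_ranges(plot, x_name, y_name):
    """Update start/end on the plot's existing ranges from the lookup table."""
    plot.x_range.start = float(ranges.loc[x_name, "Min"])
    plot.x_range.end = float(ranges.loc[x_name, "Max"])
    plot.y_range.start = float(ranges.loc[y_name, "Min"])
    plot.y_range.end = float(ranges.loc[y_name, "Max"])

set_axis_ranges(p, "gdp", "life_expectancy")
```

Inside the dashboard, update() would call set_axis_ranges(p, x_axis.value, y_axis.value) with the current widget values.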

How would I format data in a PrettyTable?

I'm getting the text from the title and href attributes from the HTML. The code runs fine and I'm able to import it all into a PrettyTable fine. The problem that I face now is that there are some titles that I believe are too large for one of the boxes in the table and thus distort the entire PrettyTable made. I've tried adjusting the hrules, vrules, and padding_width and have not found a resolution.
from bs4 import BeautifulSoup
from prettytable import PrettyTable
import urllib
r = urllib.urlopen('http://www.genome.jp/kegg-bin/show_pathway?map=hsa05215&show_description=show').read()
soup = BeautifulSoup((r), "lxml")
links = [area['href'] for area in soup.find_all('area', href=True)]
titles = [area['title'] for area in soup.find_all('area', title=True)]
k = PrettyTable()
k.field_names = ["ID", "Active Compound", "Link"]
c = 1
for i in range(len(titles)):
    k.add_row([c, titles[i], links[i]])
    c += 1
print(k)
This is how I would like the entire table to display:
print (k.get_string(start=0, end=25))
If PrettyTable can't do it, are there any other recommended modules that could accomplish this?
This turned out not to be a formatting error. The table was simply so wide that the Python window could not fit all the values on screen.
I confirmed this by switching to a much smaller font size. If it helps anyone, exporting to .csv and then arranging the data in Excel also worked.
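If the table is too wide for the window, one possible workaround is to shorten long titles before they are added as rows. The sketch below uses only the standard library. The shorten_cell helper, the 40-character limit, and the sample title are all hypothetical.

```python
import textwrap

def shorten_cell(text, width=40):
    """Collapse whitespace and truncate at a word boundary to fit within width."""
    return textwrap.shorten(text, width=width, placeholder="...")

# Hypothetical long title, not taken from the KEGG page.
title = "A deliberately long example title that would otherwise stretch the whole column"
short = shorten_cell(title)
print(short)
```

In the loop above, the row could then be added as k.add_row([c, shorten_cell(titles[i]), links[i]]).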

Freeze header in pandas dataframe

Is there a way to freeze a pandas DataFrame header (as we do in Excel), so that the headers stay visible when scrolling down a long DataFrame with many rows? I am assuming an IPython notebook.
This function may do the trick:
from ipywidgets import interact, IntSlider
from IPython.display import display
def freeze_header(df, num_rows=30, num_columns=10, step_rows=1,
                  step_columns=1):
    """
    Freeze the headers (column and index names) of a Pandas DataFrame. A widget
    enables to slide through the rows and columns.
    Parameters
    ----------
    df : Pandas DataFrame
        DataFrame to display
    num_rows : int, optional
        Number of rows to display
    num_columns : int, optional
        Number of columns to display
    step_rows : int, optional
        Step in the rows
    step_columns : int, optional
        Step in the columns
    Returns
    -------
    Displays the DataFrame with the widget
    """
    @interact(last_row=IntSlider(min=min(num_rows, df.shape[0]),
                                 max=df.shape[0],
                                 step=step_rows,
                                 description='rows',
                                 readout=False,
                                 disabled=False,
                                 continuous_update=True,
                                 orientation='horizontal',
                                 slider_color='purple'),
              last_column=IntSlider(min=min(num_columns, df.shape[1]),
                                    max=df.shape[1],
                                    step=step_columns,
                                    description='columns',
                                    readout=False,
                                    disabled=False,
                                    continuous_update=True,
                                    orientation='horizontal',
                                    slider_color='purple'))
    def _freeze_header(last_row, last_column):
        display(df.iloc[max(0, last_row-num_rows):last_row,
                        max(0, last_column-num_columns):last_column])
Test it with:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.RandomState(seed=0).randint(low=0,
                                                        high=100,
                                                        size=[200, 50]))
freeze_header(df=df, num_rows=10)
It results in (the colors were customized in the ~/.jupyter/custom/custom.css file):
Old question but wanted to revisit it because I recently found a solution. Use the qgrid module: https://github.com/quantopian/qgrid
This will not only allow you to scroll with the headers frozen but also sort, filter, edit inline and some other stuff. Very helpful.
Try pandas' sticky headers:
import pandas as pd
import numpy as np
bigdf = pd.DataFrame(np.random.randn(16, 100))
bigdf.style.set_sticky(axis="index")
(This feature was introduced recently. I found it working on pandas 1.3.1, but not on 1.2.4.)
A solution that would work in any editor is to select which rows you want to look at:
df.iloc[100:110] # shows positions 100-109, i.e. the 101st to 110th rows, keeping the header on top
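The row-selection idea can be extended to page through a whole frame, so that every printed chunk carries its own header. This is a minimal sketch. The iter_pages helper, the page size, and the sample frame are hypothetical.

```python
import numpy as np
import pandas as pd

def iter_pages(df, page_size=10):
    """Yield consecutive row slices; each slice prints with the column header."""
    for start in range(0, len(df), page_size):
        yield df.iloc[start:start + page_size]

# Hypothetical sample frame: 50 rows, 5 named columns.
df = pd.DataFrame(np.arange(250).reshape(50, 5), columns=list("abcde"))
pages = list(iter_pages(df, page_size=20))
print(pages[1].head(3))
```

Because each page is an ordinary DataFrame slice, this works the same way in a notebook, a terminal, or an IDE console.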