What is the difference between a Text instance and string in python? - python-2.7

I am trying to use matplotlib module to plot date vs. some values. I'd like to make some changes to the ticklables of the x_axis and to do that I am using the xaxis.get_ticklables(). As the matplotlib Artist tutorial says this function gives a list of Text instances. Now the question is, what is a Text instance and is there any way I can convert a Text instance to strings or numbers?
Thank you.

You can get the text, i.e. the string, for all tick labels with:
label_texts = [label.get_text() for label in ax.xaxis.get_ticklabels()]
A tick label is rich matplotlib object with many ttributes:
>>> tick_label = ax.xaxis.get_ticklabels()[0]
>>> len(dir(tick_label))
213
Type:
>>> dir(tick_label)
to see their names.

Related

Bokeh tool tip displaying all the values for the line

I am trying to plot a line chart which includes tooltip, but the code below results in displaying all the values of the line in a tooltip instead displaying a single value for those co ordinates
#Import the library
import pandas
import itertools
import bokeh
import MySQLdb
from bokeh.plotting import figure, output_file, show
from bokeh.models import HoverTool
TOOLS='hover'
wells=['F1','F2','F3','F4','F5','F6','F7','F8','F9','F10','F11','F12','G1','G2','G3','G4','G5','G6','G7','G8','G9','G10','G11','G12']
p = figure(plot_width=800, plot_height=640,x_axis_type="datetime", tools=TOOLS)
p.title.text = 'Click on legend entries to hide the corresponding lines'
# Open database connection
db = MySQLdb.connect("localhost","user","password","db" )
#pallete for the lines
my_palette=bokeh.palettes.inferno(len(wells))
#create a statement to get the data
for name, color in zip(wells,my_palette):
stmnt='select date_time,col1,wells,test_value from db where wells="%s"'%(name)
#creating dataframe
df=pandas.read_sql(stmnt,con=db)
p.scatter(df['date_time'], df['test_value'], line_width=2, color=color, alpha=0.8, legend=name,)
#Inserting tool tip
hover = p.select(dict(type=HoverTool))
hover.tooltips = [("Wells","#wells"),("Date","#%s"%(df['date_time'])),("Values","#%s"%(df['test_value']))]
hover.mode = 'mouse'
#Adding a legend
p.legend.location = "top_right"
output_file("interactive_legend.html", title="interactive_legend.py example")
show(p)
Given below is the resultant screenshot
I am trying to get only one well,Date_time,Test_value at given mouse over instance
This code:
hover.tooltips = [
("Wells","#wells"),
("Date","#%s"%(df['date_time'])),
("Values","#%s"%(df['test_value']))
]
Does not do what you think. Let's suppose df['date_time'] has the value [10, 20, 30, 40]. Then after your string substitution, your tooltip looks like:
("Date", "#[10, 20, 30, 40]")
Which exactly explains what you are seeing. The #[10 part looks for a column named "[10" in your ColumnDataSource (because of the # in front). There isn't a column with that name, so the tooltip prints ??? to indicate it can't find data to look up. The rest 20, 30, 40 is just plain text, so it gets printed as-is. In your code, you are actually passing a Pandas series and not a list, so the string substitution also prints the Name and dtype info in the tooltip text as well.
Since you are passing sequence literals to scatter, it creates a Column Data Source for you, and the default names in the CDS it are 'x' and 'y'. My best guess, is that you actually want:
hover.tooltips = [
("Wells","#wells"),
("Date","#x"),
("Values","#y")
]
But note that you would want to do this outside the loop. As it is you are simply modifying the same hover tool over and over.

how to get formula result in excel using xlwings

What i want to do is 1)get a folmula result in excel and 2)update the values to the existing excel file. [ I created and wrote the folmula using "xlsxwriter". But when I tried openpyxl (or pandas) to retrieve the folmula result, it returns 0. I want to use "xlwings" to solve this problem, but no idea how to do it. can anyone help?
#openpyx
wb = openpyxl.load_workbook(filename=xlsx_name,data_only=True)
ws = wb.get_sheet_by_name("sheet1")
print "venn_value",(ws.cell('X2').value)
#pandas
fold_merge_data=pd.read_excel(xlsx_name,sheetname=1)
print fold_merge_data['Venn diagram'][:10]
Yes, xlwings can solve this problem for you because it uses pywin32 objects to interact with Excel, rather than just reading/writing xlsx or csv documents like openpyxl and pandas. This way, Excel actually executes the formula, and xlwings grabs the result.
In order to get the value you can do:
import xlwings as xw
sheet = xw.sheets.active # if the document is open
#otherwise use sheet = xw.Book(r'C:/path/to/file.xlsx').sheets['sheetname']
result = sheet['X2'].value
Also, note that you can set the formula using, for example
sheet['A1'].value = '=1+1' # or ='B1*2' if you want to reference other cells
import xlwings as xw
sheet = xw['Sheet1']
a2_formula = sheet.range('A2').formula
sheet.range('A2:A300').formula = a2_formula #it copys relative
You can use this method for copy formula or value

Python/Pandas: How do I convert from datetime64[ns] to datetime

I have a script that processes an Excel file. The department that sends it has a system that generated it, and my script stopped working.
I suddenly got the error Can only use .str accessor with string values, which use np.object_ dtype in pandas for the following line of code:
df['DATE'] = df['Date'].str.replace(r'[^a-zA-Z0-9\._/-]', '')
I checked the type of the date columns in the file from the old system (dtype: object) vs the file from the new system (dtype: datetime64[ns]).
How do I change the date format to something my script will understand?
I saw this answer but my knowledge about date formats isn't this granular.
You can use apply function on the dataframe column to convert the necessary column to String. For example:
df['DATE'] = df['Date'].apply(lambda x: x.strftime('%Y-%m-%d'))
Make sure to import datetime module.
apply() will take each cell at a time for evaluation and apply the formatting as specified in the lambda function.
pd.to_datetime returns a Series of datetime64 dtype, as described here:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.to_datetime.html
df['DATE'] = df['Date'].dt.date
or this:
df['Date'].map(datetime.datetime.date)
You can use pd.to_datetime
df['DATE'] = pd.to_datetime(df['DATE'])

Bokeh glyph coordinates with x_axis_type 'datetime'

I am attempting to add a simple text string (glyph) to a Bokeh plot which uses x_axis_type='datetime'
My code (stripped to its essentials ) is as follows:
p = figure(plot_width=900, plot_height=380, x_axis_type='datetime')
dt = date(2003, 3, 15)
p.line(xvals, yvals)
txt = Text(
# x=some_formatting_function(dt),
x=1057005600000,
y=0.1,
text=["happy day!"],
text_align="left",
text_baseline="middle",
text_font_size="11pt",
text_font_style="italic",
)
p.add_glyph(txt)
show(p)
The x-axis range/values (ie dates) run from 2002 to 2006 and I'd like to add the text in, say, 2003. The x value I've shown in the code above (ie 1057005600000 -- which I've worked out by trial and error) drops the glyph in the right place.
But I cant work out how to use a datetime.date directly...
Is there a bokeh function (or a property of datetime.date) that will give me the value which the bokeh plot is expecting??
Many thanks.
N.B. I've tried using x = bokeh.properties.Date(dt) but this gives me:
ValueError: expected an element of either String,
Dict(String, Either(String, Float)) or Float, got <bokeh.properties.Date object
When the x_axis_type attr is set to 'datetime', Bokeh will plot things along the x-axis according to seconds-since-epoch. The easiest solution is to use datetime.datetime (not .date) and then cast your dt object to seconds-since-epoch using the timestamp() method (which will give the ~1.50e9 number you're getting) then use that for your x-coordinate.
$ from datetime import datetime
$ dt = datetime.now()
$ dt
> datetime.datetime(2015, 6, 17, 10, 41, 34, 617709)
$ dt.timestamp()
> 1434555694.617709
See the following SO question/answer for the python2 answer to my problem:
How can I convert a datetime object to milliseconds since epoch (unix time) in Python?
Thank you #Luke Canavan for pointing me in the right direction(!)

Freeze header in pandas dataframe

Is there a way by which I can freeze Pandas data frame header { as we do in excel}.So if its a long dataframe with multiple rows we can see the headers once we scroll down!! I am assuming ipython notebook
This function may do the trick:
from ipywidgets import interact, IntSlider
from IPython.display import display
def freeze_header(df, num_rows=30, num_columns=10, step_rows=1,
step_columns=1):
"""
Freeze the headers (column and index names) of a Pandas DataFrame. A widget
enables to slide through the rows and columns.
Parameters
----------
df : Pandas DataFrame
DataFrame to display
num_rows : int, optional
Number of rows to display
num_columns : int, optional
Number of columns to display
step_rows : int, optional
Step in the rows
step_columns : int, optional
Step in the columns
Returns
-------
Displays the DataFrame with the widget
"""
#interact(last_row=IntSlider(min=min(num_rows, df.shape[0]),
max=df.shape[0],
step=step_rows,
description='rows',
readout=False,
disabled=False,
continuous_update=True,
orientation='horizontal',
slider_color='purple'),
last_column=IntSlider(min=min(num_columns, df.shape[1]),
max=df.shape[1],
step=step_columns,
description='columns',
readout=False,
disabled=False,
continuous_update=True,
orientation='horizontal',
slider_color='purple'))
def _freeze_header(last_row, last_column):
display(df.iloc[max(0, last_row-num_rows):last_row,
max(0, last_column-num_columns):last_column])
Test it with:
import pandas as pd
df = pd.DataFrame(pd.np.random.RandomState(seed=0).randint(low=0,
high=100,
size=[200, 50]))
freeze_header(df=df, num_rows=10)
It results in (the colors were customized in the ~/.jupyter/custom/custom.css file):
Old question but wanted to revisit it because I recently found a solution. Use the qgrid module: https://github.com/quantopian/qgrid
This will not only allow you to scroll with the headers frozen but also sort, filter, edit inline and some other stuff. Very helpful.
Try panda's Sticky Headers:
import pandas as pd
import numpy as np
bigdf = pd.DataFrame(np.random.randn(16, 100))
bigdf.style.set_sticky(axis="index")
(this feature was introduced lately, I found it working on pandas 1.3.1, but not on 1.2.4)
A solution that would work on any editor is to select what rows you want to look at:
df.ix[100:110] # would show you from row 101 to 110 keeping the header on top