Error on pandas.read_hdf - python-2.7

I created an HDF5 file with:
pfad = "E:\Geld\Handelssysteme\Kursdaten\Ivolatity/Daten Monatsoptionen/ODAX_alles.h5"
df.to_hdf(pfad,'df', format='table')
Now I want to read a portion of the table back into a dataframe without reading every row in the file.
I tried
df=pandas.read_hdf('pfad', 'df', where = ['expiration<expirations[1] and expiration>=expirations[0]'])
where expirations is a list of datetime64[ns] values. I want a dataframe containing the rows whose "expiration" column is greater than or equal to expirations[0] and less than expirations[1].
However, I get KeyError: 'No object named df in the file'.
What would the right syntax be?

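The following works instead. Note that it passes the variable pfad, whereas the call in the question passes the quoted string 'pfad', which is not the path of the HDF5 file: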
hdf=pandas.HDFStore(pfad)
df=hdf.select('df')

Related

Convert SList to Dataframe

I am reading data from a binary .out file using the Python module "SWMMToolbox". The command to read the infiltration time series for RG1 from file.out is as follows:
x = !swmmtoolbox extract 'file.out' subcatchment,RG1,Infiltration_loss
See link for details about swmmtoolbox.
The data type of 'x' is 'IPython.utils.text.SList'.
The data looks like this:
I would like to import this SList into pandas, but I am having trouble. I want the datetime string in one column and the value after the comma in another. However, when I use
df = pd.DataFrame(data=x)
I get the following:
I also tried to use
df = pd.DataFrame.from_records(x)
but get this:
I tried to use pd.read_csv, but I couldn't get it to work since 'x' is a variable and not a file.
Any suggestions are much appreciated.
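One possible approach, sketched under assumptions: an SList behaves like a list of strings, so its lines can be joined into a single text buffer and handed to pd.read_csv through io.StringIO. The sample lines below are made up, and the layout "datetime,value" with no header row is an assumption based on the description above. The helper name slist_to_frame and the column names are hypothetical.

```python
import io

import pandas as pd


def slist_to_frame(lines, names=("datetime", "value")):
    """Parse 'datetime,value' strings into a two-column DataFrame.

    Assumes every line has exactly one comma and that there is no header row.
    """
    text = "\n".join(lines)
    return pd.read_csv(
        io.StringIO(text),       # read_csv accepts any file-like object
        header=None,
        names=list(names),
        parse_dates=[names[0]],  # turn the first column into datetimes
    )


# Made-up stand-in for the SList returned by the swmmtoolbox command.
x = [
    "2014-01-01 00:00:00,0.0",
    "2014-01-01 00:05:00,0.12",
    "2014-01-01 00:10:00,0.3",
]
df = slist_to_frame(x)
print(df)
```

If the real output starts with a header line, header=None and names would have to be dropped or adjusted.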

How to change value of any column of semicolon separated csv file using python

Hi, I am new to Python. I want to change the value of a column in a semicolon-separated CSV file. I have the following CSV file format:
"S. No.";"name";"number";"status";
"1";"Mac";"54";"ABC";
"2";"Jack";"34";"xyz";
I am using the following Python code: [image: Python code]
However, I am getting the error "list index out of range".
I have searched for similar examples, but most of them use comma-separated CSV files. The same code, without a delimiter specified, works fine for a comma-separated CSV file. I am getting row values like this:
Row [" 1"; "Mac";" 54"; "ABC";] so I am not able to access the elements of the row list. Please help me sort out the issue.
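A minimal sketch of one way to handle this, assuming the problem is that the reader is splitting on commas: passing delimiter=";" to csv.reader makes each row a proper list of fields. The sample text mirrors the file shown above. The helper name set_column and the replacement value "DONE" are hypothetical.

```python
import csv
import io

# Same layout as the file in the question, including the trailing semicolon.
SAMPLE = (
    '"S. No.";"name";"number";"status";\n'
    '"1";"Mac";"54";"ABC";\n'
    '"2";"Jack";"34";"xyz";\n'
)


def set_column(text, column, new_value, delimiter=";"):
    """Return all rows with the given column overwritten in every data row."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    idx = rows[0].index(column)  # locate the column by its header name
    for row in rows[1:]:
        if len(row) > idx:       # skip short or empty rows
            row[idx] = new_value
    return rows


rows = set_column(SAMPLE, "status", "DONE")
for row in rows:
    print(row)
```

Because every line ends with a semicolon, each parsed row carries one extra empty field at the end. It can be ignored, or dropped before writing the rows back with csv.writer(..., delimiter=";").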

Null Byte appending while reading the file through Python pandas

I have created a script that gives the matching rows between two files. After that, I pass the output file to a function, which uses the file as input to create a pivot table with pandas.
But something seems to be wrong. Below is the code snippet:
def CreateSummary(file):
    out_file = file
    file_df = pd.read_csv(out_file)  ## This function is appending NULL bytes at the end of the file
    #print file_df.head(2)
The above code gives me this error:
ValueError: No columns to parse from file
I tried another approach:
file_df = pd.read_csv(out_file,delim_whitespace=True,engine='python')
## This gives me this error:
_csv.Error: line contains NULL byte
Any suggestions and criticism are highly appreciated.
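A hedged sketch of one workaround, assuming the file on disk really does contain NUL bytes (for example, padding written by an earlier step): read the raw bytes, strip the NULs, and parse the cleaned buffer. The sample bytes and the helper name strip_null_bytes are made up for illustration. This treats the symptom only and does not explain where the NUL bytes come from.

```python
import csv
import io


def strip_null_bytes(raw):
    """Remove every NUL byte from a bytes object."""
    return raw.replace(b"\x00", b"")


# Made-up file contents: valid CSV followed by NUL padding.
raw = b"id,name\n1,Mac\n2,Jack\n" + b"\x00" * 8

clean = strip_null_bytes(raw)
rows = list(csv.reader(io.StringIO(clean.decode("utf-8"))))
print(rows)
```

For a real file, raw would come from open(path, "rb").read(). pd.read_csv also accepts file-like objects, so io.BytesIO(clean) could be passed to it in place of the file name.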

How can I calculate mean of list of strings?

I am trying to calculate the mean of one column in a CSV file. First, I read one column from the .csv file and save it into a list. Next, when I try to get the mean, I get an error:
TypeError: 'builtin_function_or_method' object has no attribute '__getitem__'
My code is:
with open('XXXXXX.csv') as f:
    reader = csv.DictReader(f)
    for row in reader:
        for (k,v) in row.items():
            columns_95[k].append(v)
sVaR5 = columns_95['95%']
mean_95 = sum（sVaR5）/len(sVaR5)
and my csv looks like:
95% 99%
1.225 2.332
1.252 10.252
2.336 4.213
... ...
When I check my list, the output is ['1.225','1.252','2.336']. I think the quote marks may be the reason my code fails, but how do I fix it? Thanks!

sum is a function. If you want to call the function sum with the argument sVaR5, you need to write:
sum(sVaR5)
If your sVaR5 is a list of strings, you could convert them to floats for the sum:
sum(map(float, sVaR5))
If you write sum[sVaR5], Python tries to call __getitem__ on the object sum, hence the error:
'builtin_function_or_method' object has no attribute '__getitem__'
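Putting the two points together, a small sketch of the whole calculation. The list holds the three values shown in the question, and the helper name mean_of_strings is hypothetical.

```python
def mean_of_strings(values):
    """Convert numeric strings to floats and return their arithmetic mean."""
    numbers = [float(v) for v in values]
    if not numbers:
        raise ValueError("cannot take the mean of an empty list")
    return sum(numbers) / len(numbers)


sVaR5 = ['1.225', '1.252', '2.336']
mean_95 = mean_of_strings(sVaR5)
print(mean_95)
```

The explicit float conversion matters because sum on a list of strings raises a TypeError, even when the call uses parentheses correctly.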

Pandas HD5-query, where expression fails

I want to query an HDF5 file. I do
df.to_hdf(pfad,'df', format='table')
to write the dataframe to disk.
To read I use
hdf = pandas.HDFStore(pfad)
I have a list called expirations that contains numpy.datetime64 values, and I try to read into a dataframe the portion of the HDF5 table whose values in column "expiration" lie between expirations[0] and expirations[1]. Entries in column expiration have the format Timestamp('2002-05-18 00:00:00').
I use the following command:
df = hdf.select('df',
where=['expiration<expiration[1]','expiration>=expirations[0]'])
However, this fails with a ValueError:
ValueError: The passed where expression: [expiration=expirations[0]]
contains an invalid variable reference
all of the variable refrences must be a reference to
an axis (e.g. 'index' or 'columns'), or a data_column
The currently defined references are: index,columns

Can you try this code:
df = hdf.select('df', where='expiration < expirations[1] and expiration >= expirations[0]')
or, as a query:
df = hdf.query('expiration < #expirations[1] and expiration >= #expirations[0]')
I'm not sure which one fits your case best. I noticed you are trying to use 'where' to filter rows without a string or a list; does that make sense?
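The error message lists only index and columns as valid references, which suggests that "expiration" was not stored as a data column. A minimal sketch of that idea, assuming PyTables is installed: write the table with data_columns=['expiration'], then filter on it in the where string. The frame, the temporary file, and the names start and end are made up for illustration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the real option data.
df = pd.DataFrame({
    "expiration": pd.to_datetime(["2002-04-20", "2002-05-18", "2002-06-22"]),
    "price": [1.0, 2.0, 3.0],
})

path = os.path.join(tempfile.mkdtemp(), "example.h5")

# data_columns makes "expiration" queryable on disk; without it only the
# index and columns axes can appear in a where expression.
df.to_hdf(path, key="df", format="table", data_columns=["expiration"])

# Variables in the calling scope can be referenced by name in the where string.
start = pd.Timestamp("2002-05-01")
end = pd.Timestamp("2002-06-01")
subset = pd.read_hdf(path, key="df",
                     where="expiration >= start & expiration < end")
print(subset)
```

An existing file written without data_columns would need to be rewritten this way before the query can work.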

We Keep Coding
