List IndexError after processing all data with Python 3

Here is the code:
name = input("Enter Molecule ID: ")
name_in = name+'.lac.dat'
print(name_in)
atm_chg = []
with open(name_in) as f:
    # skip two lines
    f.readline()
    f.readline()
    for line in f.readlines():
        atm_chg.append(float( line.split()[-1] ))
This is to process input for a larger Python program.
The input is:
LOEWDIN ATOMIC CHARGES
----------------------
0 C : -0.780631
1 H : 0.114577
2 Br: 0.309802
3 Cl: 0.357316
4 F : -0.001065
Finally, the runtime messages are:
runfile('/home/comp/Apps/Python/Testing/ReadFile_2.py', wdir='/home/comp/Apps/Python/Testing')
Enter Molecule ID: A
A.lac.dat
Traceback (most recent call last):
File "<ipython-input-1-8c665940b39f>", line 1, in <module>
runfile('/home/comp/Apps/Python/Testing/ReadFile_2.py', wdir='/home/comp/Apps/Python/Testing')
File "/home/comp/Apps/anaconda3/lib/python3.6/site-packages/spyder/utils/site/sitecustomize.py", line 866, in runfile
execfile(filename, namespace)
File "/home/comp/Apps/anaconda3/lib/python3.6/site-packages/spyder/utils/site/sitecustomize.py", line 102, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "/home/comp/Apps/Python/Testing/ReadFile_2.py", line 27, in <module>
atm_chg.append(float( line.split()[-1] ))
IndexError: list index out of range
In spite of the error, there is an entry in the Variable Explorer (Spyder IDE):
[-0.780631, 0.114577, 0.309802, 0.357316, -0.001065]
which is exactly what I require.

The program is crashing because, after the last line of data, it tries to process an empty (or whitespace-only) line: line.split() returns an empty list for such a line, so [-1] raises the IndexError. That is also why the Variable Explorer shows the complete list; the exception occurs only after every data line has been appended. You can fix this by making sure the line is non-empty before processing it:
for line in f.readlines():
    if line.strip():  # is the line non-empty, ignoring whitespace?
        atm_chg.append(float( line.split()[-1] ))
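Putting it together, the whole reader with that guard in place might look like this (same file naming and expected output as in the question; iterating over the file object directly is equivalent to readlines() here):
name = input("Enter Molecule ID: ")
name_in = name + '.lac.dat'
print(name_in)

atm_chg = []
with open(name_in) as f:
    # Skip the two header lines (title and underline).
    f.readline()
    f.readline()
    for line in f:
        # Ignore blank lines, e.g. a trailing newline at end of file.
        if line.strip():
            atm_chg.append(float(line.split()[-1]))

print(atm_chg)  # [-0.780631, 0.114577, 0.309802, 0.357316, -0.001065]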

Related

IndexError: list index out of range not getting solved

import glob
from bs4 import BeautifulSoup
f = open('csvfile.csv','w')
for file in glob.glob('*.htm'):
    print 'Processing', file
    for y in range(0,3):
        for x in range(0, 6):
            soup = BeautifulSoup(open(file).read())
            all_string=soup.find_all("h2")[x].get_text()
            #stack=[]
            #acct.write(", ".join(stack) + '\n')
            f.write(all_string)
            f.write('\n')
            print(all_string)
    x=0
f.close()
Output:
Processing Alkali-Controlled C–H Cleavage or N–C Bond Formation by N2-Derived Iron Nitrides and Imides - Journal of the American Chemical Society (ACS Publications).htm
Abstract
Supporting Information
Vanadium-catalyzed Reduction of Molecular Dinitrogen into Silylamine under Ambient Reaction Conditions
Traceback (most recent call last):
File "", line 1, in <module>
runfile('/Users/ROXX/Desktop/project/csv1.py', wdir='/Users/ROXX/Desktop/project')
File "/Users/ROXX/anaconda/lib/python2.7/site-packages/spyder/utils/site/sitecustomize.py", line 880, in runfile
execfile(filename, namespace)
File "/Users/ROXX/anaconda/lib/python2.7/site-packages/spyder/utils/site/sitecustomize.py", line 94, in execfile
builtins.execfile(filename, *where)
File "/Users/ROXX/Desktop/project/csv1.py", line 17, in <module>
all_string=soup.find_all("h2")[x].get_text()
IndexError: list index out of range
The reason for the error is most likely that the file being processed contains fewer than six h2 tags, while the inner loop always indexes positions 0 through 5. Iterating over the result of find_all("h2") directly, instead of over a fixed range of indices, avoids the problem.
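A minimal sketch of that change, reusing the f and file variables from the question:
for file in glob.glob('*.htm'):
    soup = BeautifulSoup(open(file).read())
    # Loop over however many h2 tags the file actually contains,
    # rather than assuming there are always six.
    for heading in soup.find_all("h2"):
        f.write(heading.get_text())
        f.write('\n')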

Tensorboard - add_summary makes my code crash

I am very new to TensorFlow, and I am trying to display my first TensorBoard.
I downloaded and ran the board for the example given here:
https://www.tensorflow.org/versions/r0.7/how_tos/summaries_and_tensorboard/index.html
Following that method, I have this in my code:
weights_hidden = tf.Variable(tf.truncated_normal([image_size * image_size, 1024]), name='weights_hidden')
_ = tf.histogram_summary('weights_hidden', weights_hidden)
and when I run the session
with tf.Session(graph=graph) as session:
    merged = tf.merge_all_summaries()
    writer = tf.train.SummaryWriter("/tmp/test", session.graph_def)
    tf.initialize_all_variables().run()
    for step in range(num_steps):
        summary_str, l, predictions = session.run(
            [optimizer, loss, train_prediction], feed_dict=feed_dict)
        if (step % 500 == 0):
            writer.add_summary(summary_str, step)
The process crashes with the following error:
Traceback (most recent call last):
File "/home/xxx/Desktop/xxx/xxx.py", line 108, in <module>
writer.add_summary(summary_str, step)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/summary_io.py", line 128, in add_summary
event = event_pb2.Event(wall_time=time.time(), summary=summary)
File "/usr/local/lib/python2.7/dist-packages/google/protobuf/internal/python_message.py", line 522, in init
_ReraiseTypeErrorWithFieldName(message_descriptor.name, field_name)
File "/usr/local/lib/python2.7/dist-packages/google/protobuf/internal/python_message.py", line 453, in _ReraiseTypeErrorWithFieldName
six.reraise(type(exc), exc, sys.exc_info()[2])
File "/usr/local/lib/python2.7/dist-packages/google/protobuf/internal/python_message.py", line 520, in init
copy.MergeFrom(new_val)
File "/usr/local/lib/python2.7/dist-packages/google/protobuf/internal/python_message.py", line 1237, in MergeFrom
"expected %s got %s." % (cls.__name__, type(msg).__name__))
TypeError: Parameter to MergeFrom() must be instance of same class: expected Summary got NoneType. for field Event.summary
What am I missing?
Any help/comment would be very welcome. Thank you very much for the help.
K.
You should write:
_, summary_str, l, predictions = session.run(
    [optimizer, merged, loss, train_prediction], feed_dict=feed_dict)
I added merged as a fourth fetch, which corresponds to the summary you are trying to get. Before, summary_str was bound to the result of the optimizer op, which is None; that is exactly the NoneType the MergeFrom() error complains about.
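With that change applied, the loop from the question would look like this (all names as in the question):
with tf.Session(graph=graph) as session:
    merged = tf.merge_all_summaries()
    writer = tf.train.SummaryWriter("/tmp/test", session.graph_def)
    tf.initialize_all_variables().run()
    for step in range(num_steps):
        # Fetch the merged summary alongside the training ops;
        # the optimizer's None result goes into the throwaway _.
        _, summary_str, l, predictions = session.run(
            [optimizer, merged, loss, train_prediction], feed_dict=feed_dict)
        if step % 500 == 0:
            writer.add_summary(summary_str, step)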

reindent.py - Does not work from the command line

I have problems with indentation in Python, so I downloaded reindent.py to correct the indentation errors.
I installed reindent.py using the following command:
pip install reindent
But running it from the command line shows me the following error:
Traceback (most recent call last):
File "/usr/local/bin/reindent", line 3, in <module>
main()
File "/usr/local/lib/python2.7/dist-packages/reindent.py", line 92, in main
check(arg)
File "/usr/local/lib/python2.7/dist-packages/reindent.py", line 118, in check
if r.run():
File "/usr/local/lib/python2.7/dist-packages/reindent.py", line 177, in run
tokenize.tokenize(self.getline, self.tokeneater)
File "/usr/lib/python2.7/tokenize.py", line 170, in tokenize
tokenize_loop(readline, tokeneater)
File "/usr/lib/python2.7/tokenize.py", line 176, in tokenize_loop
for token_info in generate_tokens(readline):
File "/usr/lib/python2.7/tokenize.py", line 357, in generate_tokens
("<tokenize>", lnum, pos, line))
File "<tokenize>", line 127
for w in transcript:
^
IndentationError: unindent does not match any outer indentation level
I am running it with the following command:
reindent -n test1.py
I thought reindent was supposed to correct the errors, not show me where they occurred.
reindent.py changes tabs to spaces and can make irregular indentation a uniform 4 spaces. It does not attempt to catch or fix IndentationErrors.
Consider this code, which has an IndentationError:
def foo():
        print("Let's go")
      for i in range(2):      <-- IndentationError
          print('Peay')
It produces a similar error message to the one you are getting:
% reindent.py script.py
Traceback (most recent call last):
...
File "/usr/lib/python2.7/tokenize.py", line 170, in tokenize
tokenize_loop(readline, tokeneater)
File "/usr/lib/python2.7/tokenize.py", line 176, in tokenize_loop
for token_info in generate_tokens(readline):
File "/usr/lib/python2.7/tokenize.py", line 357, in generate_tokens
("<tokenize>", lnum, pos, line))
File "<tokenize>", line 9
for i in range(2):
^
IndentationError: unindent does not match any outer indentation level
Both
def foo():
    print("Let's go")
    for i in range(2):
        print('Peay')
and
def foo():
    print("Let's go")
for i in range(2):
    print('Peay')
are valid ways to fix the code. reindent.py (or the tokenize module that it relies on) does not attempt to guess which one the coder intended. Thus, IndentationErrors are SyntaxErrors that at least sometimes require human intervention to fix.
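By contrast, code that is merely irregular but still valid is what reindent.py normalizes. A small before/after sketch of that behavior:
# Before: valid Python, but mixed 2- and 6-space indents
def bar():
  print('start')
  for i in range(2):
      print(i)

# After running reindent.py: uniform 4-space indents
def bar():
    print('start')
    for i in range(2):
        print(i)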

pandas get_group memory error

I am using pandas v0.14.1 with Python 2.7.
I have a groupby object and I am trying to pull out a group identified by a particular key. The key is in fact one of the group keys:
>>> key in key_groups.groups.keys()
True
but when I try to make the get_group call it fails with a memory error:
>>> key_groups.get_group(key)
*** MemoryError:
The full stacktrace is:
Traceback (most recent call last):
File "main.py", line 141, in <module>
main(num_days=arguments.days, num_variants=arguments.variants)
File "main.py", line 76, in main
problem, solution = Solver.Solve(request, num_variants)
File "/srv/compunctuator/src/Solver.py", line 49, in Solve
solution = attempt_minimization(t)
File "/srv/compunctuator/src/Solver.py", line 41, in attempt_minimization
t.scruple()
File "/srv/compunctuator/src/Compunctuator.py", line 136, in scruple
self.__iterate__()
File "/srv/compunctuator/src/Compunctuator.py", line 95, in __iterate__
self.__maximize_impressions__()
File "/srv/compunctuator/src/Compunctuator.py", line 583, in __maximize_impressions__
df = key_groups.get_group(key)
File "/srv/compunctuator/.virtualenvs/compunctuator/local/lib/python2.7/site-packages/pandas/core/groupby.py", line 573, in get_group
inds = self._get_index(name)
File "/srv/compunctuator/.virtualenvs/compunctuator/local/lib/python2.7/site-packages/pandas/core/groupby.py", line 429, in _get_index
sample = next(iter(self.indices))
File "/srv/compunctuator/.virtualenvs/compunctuator/local/lib/python2.7/site-packages/pandas/core/groupby.py", line 414, in indices
return self.grouper.indices
File "properties.pyx", line 34, in pandas.lib.cache_readonly.__get__ (pandas/lib.c:36380)
File "/srv/compunctuator/.virtualenvs/compunctuator/local/lib/python2.7/site-packages/pandas/core/groupby.py", line 1253, in indices
return _get_indices_dict(label_list, keys)
File "/srv/compunctuator/.virtualenvs/compunctuator/local/lib/python2.7/site-packages/pandas/core/groupby.py", line 3474, in _get_indices_dict
np.prod(shape))
File "algos.pyx", line 1997, in pandas.algos.groupsort_indexer (pandas/algos.c:37521) MemoryError
If I actually use the dictionary lookup I can get the indices out:
>>> key_groups.groups[key]
[0, 2]
It seems like everything should work here.
I realize a similar question was asked here: pandas get_group causes memory error. But it was never resolved, and I thought I could give more details if necessary.
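A possible stop-gap, not tested against v0.14.1: since the indices are retrievable from the groups dict, the rows can be pulled from the original frame directly (df here is a stand-in for the dataframe the groupby was built from):
# Hypothetical workaround: bypass get_group and index the original
# dataframe with the labels stored in the groups dict.
rows = key_groups.groups[key]   # e.g. [0, 2], as shown above
df_group = df.loc[rows]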

Using a random forest to classify reviews, but getting a KeyError?

I have the following code in Python:
from sklearn.ensemble import RandomForestClassifier
forest = RandomForestClassifier(n_estimators = 100)
forest = forest.fit( train_data_features, train["sentiment"] )
but I get a KeyError for "sentiment" and I don't know why. The dataframe was loaded with:
train = pd.read_csv("labeledTrainData.tsv", header=0, delimiter="\t", quoting=3)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site--packages/pandas/core/frame.py", line 1780, in __getitem__
return self._getitem_column(key)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/frame.py", line 1787, in _getitem_column
return self._get_item_cache(key)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/generic.py", line 1068, in _get_item_cache
values = self._data.get(item)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/internals.py", line 2849, in get
loc = self.items.get_loc(item)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pandas/core/index.py", line 1402, in get_loc
return self._engine.get_loc(_values_from_object(key))
File "pandas/index.pyx", line 134, in pandas.index.IndexEngine.get_loc (pandas/index.c:3807)
File "pandas/index.pyx", line 154, in pandas.index.IndexEngine.get_loc (pandas/index.c:3687)
File "pandas/hashtable.pyx", line 696, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12310)
File "pandas/hashtable.pyx", line 704, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12261)
KeyError: 'sentiment'
Are you doing the Kaggle competition? https://www.kaggle.com/c/word2vec-nlp-tutorial/data
Are you sure you downloaded and decompressed the file correctly? The first part of the file reads:
id sentiment review
"5814_8" 1 "With all this stuff go
This works for me:
>>> train = pd.read_csv("labeledTrainData.tsv", delimiter="\t")
>>> train.columns
Index([u'id', u'sentiment', u'review'], dtype='object')
>>> train.head(3)
id sentiment review
0 5814_8 1 With all this stuff going down at the moment w...
1 2381_9 1 \The Classic War of the Worlds\" by Timothy Hi...
2 7759_3 0 The film starts with a manager (Nicholas Bell)...
Check that the columns are set up correctly in the train variable; you should have a sentiment column. That column seems to be missing from your dataframe.
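A quick diagnostic along those lines, using the same read_csv call as the question:
import pandas as pd

train = pd.read_csv("labeledTrainData.tsv", header=0, delimiter="\t", quoting=3)
print(train.shape)
print(train.columns.tolist())  # expect ['id', 'sentiment', 'review']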