#!/usr/bin/python
import sys

def Count_Totals(lst):  # returns an array with the number of elements in each level of nesting
    for el in lst:
        if type(el) == list:
            temp = el
            index = index + 1  # this is the index of Counts, which indicates the level of nesting
            Count_Totals(el)
        else:
            Counts[index] = Counts[index] + 1  # when we reach the bottom nest, count elements
            Size = Size + 1
        if Size == len(temp) - 1:
            index = index - 1  # when the list inside the list runs out of elements

in_file = sys.argv[1]
with open(in_file, 'r') as input_file:
    lines = input_file.readlines()  # reads the entire ASCII file and saves it into a list called lines
Frame = 0
Size = 0
index = 0
for line in lines:  # creates a list of each row in the text file
    Counts = []
    Frame = Frame + 1
    Count_Totals(line)  # counts the elements in each level of nest
    print("The Frame contains: %d subframes, which is %d Symbols" % (Counts[0], Counts[1]))
Hello, I am trying to write a Python 2.7 program that takes a text file containing nested lists, counts the number of items at each level of nesting, and prints the values. I have written the preceding code using recursion, but running it fails with the following error:
UnboundLocalError: local variable 'index' referenced before assignment
My understanding is that the index inside the function is outside the scope of the index variable I initialized at the bottom. Any help to fix this code or get it up and running would be greatly appreciated.
It looks to me like you don't want to access a local variable named index. You want to access a global named index, which is what you initialized to 0 later on.
When Python sees a variable used in a function, it has to figure out whether you intended it to be a global or a local. It errs on the side of it being a local. In particular, if you assign to the variable anywhere in the function, it assumes it is a local (the exact rules are here).
To fix this, all you have to do is add a global statement into your function. That tells Python that the specified variables are actually globals.
def Count_Totals(lst):
    global index
    for el in lst:
        ...
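Note that Size is also assigned inside the function, so it would need the same global treatment. As an alternative, here is a minimal sketch that avoids globals altogether by passing the nesting depth and the running totals as arguments. The name count_totals and its parameters are hypothetical, and the sketch assumes the input is already a Python list rather than a line of text read from a file.

```python
def count_totals(lst, depth=0, counts=None):
    """Count the non-list elements found at each nesting depth.

    Hypothetical helper: depth 0 is the outermost list.
    """
    if counts is None:
        counts = []
    # Make sure there is a slot for the current depth.
    while len(counts) <= depth:
        counts.append(0)
    for el in lst:
        if isinstance(el, list):
            # Recurse one level deeper, sharing the same counts list.
            count_totals(el, depth + 1, counts)
        else:
            counts[depth] += 1
    return counts


nested = [1, [2, 3, [4]], 5]
print(count_totals(nested))  # [2, 2, 1]
```

Because the state travels through the arguments, no global statement is needed and each call starts from a clean slate.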
I have just started doing my first research project, and I have just begun programming (approximately 2 weeks ago). Excuse me if my questions are naive. I might be using python very inefficiently. I am eager to improve here.
I have experimental data that I want to analyse. My goal is to create a python script that takes the data as input, and that for output gives me graphs, where certain parameters contained in text files (within the experimental data folders) are plotted and fitted to certain equations. This script should be as generalizable as possible so that I can use it for other experiments.
I'm using the Anaconda, Python 2.7, package, which means I have access to various libraries/modules related to science and mathematics.
I am stuck at trying to use For and While loops (for the first time).
The data files are structured like this (I am using regex brackets here):
.../data/B_foo[1-7]/[1-6]/D_foo/E_foo/text.txt
What I want to do is cycle through all 7 top directories and each of their 6 subdirectories (named 1, 2, 3, ..., 6). Within each of these 6 subdirectories there is a text file (always with the same filename, text.txt), which contains the data I want to access.
The 'text.txt' files are structured something like this:
1 91.146 4.571 0.064 1.393 939.134 14.765
2 88.171 5.760 0.454 0.029 25227.999 137.883
3 88.231 4.919 0.232 0.026 34994.013 247.058
4 ... ... ... ... ... ...
The table continues down. Every other row is empty. I want to extract information from 13 rows starting from the 8th line, and I'm only interested in the 2nd, 3rd and 5th columns. I want to put them into lists 'parameter_a' and 'parameter_b' and 'parameter_c', respectively. I want to do this from each of these 'text.txt' files (of which there is a total of 7*6 = 42), and append them to three large lists (each with a total of 7*6*13 = 546 items when everything is done).
This is my attempt:
First, I made a list, 'list_B_foo', containing the seven different 'B_foo' directories (this part of the script is not shown). Then I made this:
parameter_a = []
parameter_b = []
parameter_c = []
j = 7 # The script starts reading 'text.txt' after the j:th line.
k = 35 # The script stops reading 'text.txt' after the k:th line.
x = 0
while x < 7:
    for i in range(1, 7):
        path = str(list_B_foo[x]) + '/%s/D_foo/E_foo/text.txt' % i
        m = open(path, 'r')
        line = m.readlines()
        while j < k:
            line = line[j]
            info = line.split()
            print 'info:', info
            parameter_a.append(float(info[1]))
            parameter_b.append(float(info[2]))
            parameter_c.append(float(info[5]))
            j = j + 2
    x = x + 1
parameter_a_vect = np.array(parameter_a)
parameter_b_vect = np.array(parameter_b)
parameter_c_vect = np.array(parameter_c)
print 'a_vect:', parameter_a_vect
print 'b_vect:', parameter_b_vect
print 'c_vect:', parameter_c_vect
I have tried to fiddle around with indentation without getting it to work (receiving either syntax error or indentation errors). Currently, I get this output:
info: ['1', '90.647', '4.349', '0.252', '0.033', '93067.188', '196.142']
info: ['.']
Traceback (most recent call last):
File "script.py", line 104, in <module>
parameter_a.append(float(info[1]))
IndexError: list index out of range
I don't understand why I get the "list index out of range" message. If anyone knows why this is the case, I would be happy to hear you out.
How do I solve this problem? Is my approach completely wrong?
EDIT: I went for a pure while-loop solution, taking RebelWithoutAPulse and CamJohnson26's suggestions into account. This is how I solved it:
parameter_a = []
parameter_b = []
parameter_c = []
k = 35 # The script stops reading the text file after the k:th line.
x = 0
while x < 7:
    y = 1
    while y < 7:
        j = 7  # reset the starting line for every file
        path = str(list_B_foo[x]) + '/%s/pdata/999/dcon2dpeaks.txt' % (y)
        m = open(path, 'r')
        lines = m.readlines()
        while j < k:
            line = lines[j]
            info = line.split()
            parameter_a.append(float(info[1]))
            parameter_b.append(float(info[2]))
            parameter_c.append(float(info[5]))
            j = j + 2
        y = y + 1
    x = x + 1
Meta: I am not sure if I should give the answer to the person who answered the quickest and helped me finish my task, or to the person whose answer I learned the most from. I am sure this is a common issue that I can find an answer to by reading the rules or going to Stack Exchange Meta. Until I've read up on the recommendations, I will hold off on marking the question as answered by either of you.
Welcome to Stack Overflow!
The error is due to a name collision that you have inadvertently created. Note the output before the exception occurs:
info: ['1', '90.647', '4.349', '0.252', '0.033', '93067.188', '196.142']
info: ['.']
Traceback (most recent call last):
...
info[1] cannot be evaluated: the list contains only '.', so it has no element at index 1 (Python lists start at position 0).
This happens in your nested loop,
    while j < k:
where you rebind the very variable, line, that holds the lines you read just before:
    line = m.readlines()
    while j < k:
        line = line[j]
        info = line.split()
        ...
So on the first pass through the loop, you have read the lines of the file into the list line, then you take one line from that list, assign it to line again, and continue with the loop. At this point line contains a string.
On the next pass, indexing line reads the single character at position j of that string, and the code malfunctions.
You could fix this by using different names.
P.S. I would suggest using with ... as ... syntax while working with files, it is briefly described here - this is called a context manager and it takes care of opening and closing the files for you.
P.P.S. I would also suggest reading the naming conventions
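To illustrate the context manager suggestion, here is a minimal sketch that reads every other line of a file between two positions and collects selected columns. The function read_columns and its parameters are hypothetical names chosen for this example, and the demo file is a small stand-in for the real data; the default values simply mirror the j, k and column indices from the question.

```python
import os
import tempfile


def read_columns(path, start=7, stop=35, step=2, columns=(1, 2, 5)):
    """Return one list of floats per requested column (hypothetical helper)."""
    results = [[] for _ in columns]
    with open(path, 'r') as handle:  # the file is closed automatically
        lines = handle.readlines()   # 'lines' is the list, 'line' is one entry
    for line in lines[start:stop:step]:
        fields = line.split()
        for out, col in zip(results, columns):
            out.append(float(fields[col]))
    return results


# Small stand-in file: data on lines 1 and 3, blank lines in between.
demo_lines = [
    "header\n",
    "1 91.146 4.571 0.064 1.393 939.134 14.765\n",
    "\n",
    "2 88.171 5.760 0.454 0.029 25227.999 137.883\n",
    "\n",
]
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.writelines(demo_lines)

a, b, c = read_columns(tmp.name, start=1, stop=5, step=2)
os.remove(tmp.name)
print([a, b, c])
```

Keeping the list (lines) and the current entry (line) under separate names avoids the collision described above, and slicing with a step replaces the manual j = j + 2 bookkeeping.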
It looks like you are overwriting the line list with a single line of the file. You call line = m.readlines(), which sets line to a list of lines. You then set line = line[j], so now the line variable is no longer a list; it's a string such as
1 91.146 4.571 0.064 1.393 939.134 14.765
This pass works fine, but the next pass treats line as a sequence of characters and takes a single character from it, which here is just a period, and assigns that back to line. That explains why the info variable has only one element on the second pass through the loop.
To solve this, use two variables instead of one. Call one lines and the other line.
    lines = m.readlines()
    while j < k:
        line = lines[j]
        info = line.split()
There may be other errors too, but that should get you started.
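Since the question is also about cycling through the 7 top directories and their 6 subdirectories, here is a rough sketch of how two nested for loops could build the 42 paths without manual counters. The function build_paths, its parameters and the demo directory names are hypothetical; the path components follow the layout described in the question.

```python
import os


def build_paths(top_dirs, n_sub=6, tail=('D_foo', 'E_foo', 'text.txt')):
    """Return the path of the text file in every numbered subdirectory."""
    paths = []
    for top in top_dirs:                  # outer loop: top directories
        for sub in range(1, n_sub + 1):   # inner loop: subdirectories 1..n_sub
            paths.append(os.path.join(top, str(sub), *tail))
    return paths


# Stand-in names for the seven B_foo directories.
demo_dirs = ['B_foo%d' % n for n in range(1, 8)]
paths = build_paths(demo_dirs)
print(len(paths))  # 42
```

Each path in the resulting list can then be opened and parsed in turn, so no loop variable has to be reset by hand between files.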
I am trying to run the following code:
fname = raw_input ('Enter file name:')
fh = open (fname)
count = 0
for line in fh:
    if not line.startswith ('X-DSPAM-Confidence:') : continue
    else:
        count = count + 1
new = fh #this new = fh is supposed to be fh stripped of the non- x-dspam lines
for line in new: # this separates the lines in new and allows finding the floats on each line
    numpos = new.find ('0')
    endpos = new.find ('5', numpos)
    num = new[numpos:endpos + 1]
    float (num)
# should now have a list of floats
print num
The intention of this code is to prompt the user for a file name, open the file, read through the file, compile all the lines that start with X-DSPAM, and extract the float number on these lines. I am fairly new to coding so I realise I may have committed a number of errors, but currently when I try to run it, after putting in the file name I get the return:
I looked around and I have seen that mode 'r' refers to different file modes in python in relation to how the end of the line is handled. However the code I am trying to run is similar to other code I have formulated and it does not have any non-text files inside, the file being opened is a .txt file. Is it something to do with converting a list of strings line by line to a list of float numbers?
Any ideas on what I am doing wrong would be appreciated.
The default mode of handling a file is 'r' - which means 'read', which is what you want. It means the program is going to read the file (as opposed to 'w' - write, or 'a' - append, for example - which would allow you to overwrite the file or append to it, which you don't want in this case).
There are some bugs in your code, which I've tried to indicate in the edited code below.
You don't need to assign new = fh - you're not grabbing lines and passing them to a new file. Rather, you're checking each line against the 'X-DSPAM-Confidence:' criterion, and if it matches, you can proceed to parse out the desired number. If not, you ignore it and go to the next line.
With that in mind, you can move all of the code from the for line in new loop into the else branch of the original if not ... else block.
How you find the end of the number is also a bit off. You set endpos by searching for an occurrence of the character '5' - but what I think you want is a position 5 characters on from the start position (numpos + 5).
(There are other ways to parse the line and pull the number, but I'm going to stick with your logic as indicated by your code, so nothing fancy here.)
You can convert to float in the same statement where you slice the number from the line (as below). It's acceptable to do:
num = line[numpos:endpos+1]
float_num = float(num)
but not necessary. In any event, you want to assign the conversion (float(num)) to a variable - just having float(num) doesn't allow you to pass the converted value to another statement (including print).
You say that you should have 'a list of floats' - the code as corrected below - will give you a display of all the floats, but if you want an actual Python list, there are other steps involved. I don't think you wanted a Python list, but just in case:
numlist = [] # at the beginning, declare a new, empty list
...
# after converting to float, append number to list
numlist.append(num)
print numlist # at end of program, to print full list
In any event, this edited code works for me with an appropriate file of test data, and outputs the desired float numbers:
fname = raw_input ('Enter file name:')
fh = open (fname)
count = 0
for line in fh:
    if not line.startswith ('X-DSPAM-Confidence:') : continue
    else:
        # there's no need to create the 'new' variable
        # any lines that meet the criteria can be processed for numbers
        count = count + 1
        numpos = line.find ('0')
        # i think what you want here is to set an endpoint 5 positions to the right
        # but your code was looking for the position of a '5' in the line
        endpos = numpos + 5
        # you can convert to float and slice in the same statement
        num = float(line[numpos:endpos+1])
        print num
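For comparison, here is a sketch of one of the "other ways" to pull the number out: slice off the known prefix instead of searching for digits, and collect the values in a list. The function name extract_confidences and the sample lines are made up for illustration, and the sketch assumes everything after the prefix on a matching line is the number.

```python
def extract_confidences(lines, prefix='X-DSPAM-Confidence:'):
    """Return the float that follows the prefix on each matching line."""
    values = []
    for line in lines:
        if line.startswith(prefix):
            # Drop the prefix, strip spaces and the newline, then convert.
            values.append(float(line[len(prefix):].strip()))
    return values


sample = [
    "From: someone@example.com\n",
    "X-DSPAM-Confidence: 0.8475\n",
    "Subject: hello\n",
    "X-DSPAM-Confidence:    0.7521\n",
]
print(extract_confidences(sample))  # [0.8475, 0.7521]
```

An open file object can be passed in place of the sample list, since iterating over a file yields its lines one at a time.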
Part of my high school assignment is to write a function that finds the average of a list of floating-point numbers. We can't use len and such, so something like sum(numList)/float(len(numList)) isn't an option for me. I've spent an hour researching and racking my brain for a way to find the list length without using the len function, and I've got nothing, so I was hoping either to be shown how to do it or to be pointed in the right direction. Help me, Stack Overflow, you're my only hope. :)
Use a loop to add up the values from the list, and count them at the same time:
def average(numList):
    total = 0
    count = 0
    for num in numList:
        total += num
        count += 1
    return total / count
If you might be passed an empty list, you might want to check for that first and either return a predetermined value (e.g. 0), or raise a more helpful exception than the ZeroDivisionError you'll get if you don't do any checking.
If you're using Python 2 and the list might be all integers, you should either put from __future__ import division at the top of the file, or convert one of total or count to a float before doing the division (initializing one of them to 0.0 would also work).
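The two caveats above could be handled as in the following sketch. Raising ValueError for an empty list is only one possible choice (returning a predetermined value would work too), and starting total at 0.0 is what forces floating-point division on Python 2.

```python
def average(num_list):
    """Average the numbers in num_list without using len() or sum()."""
    total = 0.0   # a float, so the division below is never integer division
    count = 0
    for num in num_list:
        total += num
        count += 1
    if count == 0:
        # One possible policy for empty input; returning 0 is another.
        raise ValueError("average() of an empty list is undefined")
    return total / count


print(average([1, 2, 3, 4]))  # 2.5
```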
Might as well show how to do it with a while loop since it's another opportunity to learn.
Normally, you won't need counter variable(s) inside of a for loop. However, there are certain cases where it's helpful to keep a count as well as retrieve the item from the list and this is where enumerate() comes in handy.
Basically, the solution below is what @Blckknght's solution is doing internally.
def average(items):
    """
    Takes in a list of numbers and finds the average.
    """
    if not items:
        return 0
    # iter() creates an iterator.
    # Calling next() on an iterator returns
    # the next item in the sequence of items.
    it = iter(items)
    count = 0
    total = 0
    while True:
        try:
            # when there are no more items in the list,
            # next() raises StopIteration
            total += next(it)
        except StopIteration:
            # this gives us an opportunity
            # to break out of the infinite loop
            break
        # since StopIteration is raised before a value
        # is returned, we don't increment the counter
        # until after a valid value is retrieved
        count += 1
    # perform the normal average calculation
    return total / float(count)
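The enumerate() approach mentioned above is not shown in code, so here is a minimal sketch of it. The function name average_with_enumerate is made up for this example; starting the enumeration at 1 means that, after the loop, count holds the number of items seen.

```python
def average_with_enumerate(items):
    """Average items, letting enumerate() keep the running count."""
    total = 0
    count = 0   # stays 0 if the loop body never runs
    for count, value in enumerate(items, start=1):
        total += value
    if count == 0:
        return 0   # same policy for empty input as the while-loop version
    return total / float(count)


print(average_with_enumerate([1, 2, 3, 4]))  # 2.5
```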
def length_of_list(my_list):
    if not my_list:
        return 0
    return 1 + length_of_list(my_list[1:])
I'm a new programmer and I'm having a difficult time finishing up my 4th program. The premise was to create a program that takes input from the user to build a list, then compares this list to a tuple. Afterwards it prints a statement letting the user know which of the items they chose correspond to items in the tuple, and in which position they are in the tuple.
The problem I'm having is the last part: I can't get the correct position to print, and I fail to understand why. For example, if someone chose GPS correctly during their guesses, it should print position 0, but it doesn't. If water is chosen, it says it's in position 13...but it should be 5.
#here is the code I have so far:
number_items_input = 0
guessed_inventory_list = [] #this is the variable list that will be input by user
survival_gear = () #this is the tuple that will be compared against
survival_gear = ("GPS","map","compass","firstaid","water","flashlight","lighter","blanket","rope","cell phone","signal mirror")
#block below takes input from the user
print("Please choose one by one, which top 10 items do you want with you in case of a survival situation, think Bear Grylls. Once chosen, your list will be compared to the top 10 survival items list.")
while number_items_input < 10:
    print("Please choose.")
    guessed_items = input()
    guessed_inventory_list.append(guessed_items)
    number_items_input = number_items_input + 1
print ("You have chosen the following:", guessed_inventory_list)
#block of code below here compares the input to the tuple
t = 1
while t < 1:
    t = t + 1
for individual_items in guessed_inventory_list:
    for top_items in survival_gear:
        if individual_items == top_items:
            #finally the print statements below advise the user if they guessed an item and which position it's in.
            print ("You have chosen wisely", top_items)
            print ("It's in position", t, "on the survival list")
        t = t + 1
The reason you are getting the wrong index is that the loops are nested the wrong way round. The outer loop should iterate over the tuple you are comparing against, and the inner loop over the list built from the user's input; in your code it is the reverse. See the corrected code snippet below:
for top_items in survival_gear:
    for individual_items in guessed_inventory_list:
        if individual_items == top_items:
            #finally the print statements below advise the user if they guessed an item and which position it's in.
            print ("You have chosen wisely", top_items)
            print ("It's in position", t, "on the survival list")
    t = t + 1
The snippet above should solve your problem, but your code could also be simplified:
- The while loop can be avoided by using the range built-in function.
- Incrementing the variable t manually can be avoided by using the enumerate built-in function.
- The nested for loop and if statement can be replaced with the "in" membership test operator.
Find the updated code below:
#!/usr/bin/python
number_items_input = 0
guessed_inventory_list = [] #this is the variable list that will be input by user
survival_gear = ("GPS","map","compass","firstaid","water","flashlight","lighter","blanket","rope","cell phone","signal mirror")
#block below takes input from the user
print("Please choose one by one, which top 10 items do you want with you in case of a survival situation, think Bear Grylls. Once chosen, your list will be compared to the top 10 survival items list.")
# One can use the range function to loop n times, in this case 10 times
for i in range(0, 10):
    # input() returns the typed text as a string in Python 3,
    # which is what the print() calls in the question suggest is in use
    guessed_items = input("Please choose:")
    guessed_inventory_list.append(guessed_items)
print("You have chosen the following:", guessed_inventory_list)
# enumerate is one of the built-in Python functions.
# It returns an enumerate object, which yields tuples (immutable sequences),
# each containing a pair of count/index and value,
# like (1, 'GPS'), (2, 'map'), (3, 'compass'), ..., (11, 'signal mirror').
# In the for loop below, each tuple is
# unpacked into t and individual_items on each iteration
for t, individual_items in enumerate(survival_gear, start=1):
    # "in" is a membership test operator which tests whether
    # individual_items is in the list guessed_inventory_list
    if individual_items in guessed_inventory_list:
        #finally the print statements below advise the user if they guessed an item and which position it's in.
        print("You have chosen wisely", individual_items)
        print("It's in position", t, "on the survival list")
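The enumerate and "in" combination can be tried without typing ten guesses. The sketch below wraps it in a small function; the name matched_positions, the shortened gear tuple and the sample guesses are all made up for illustration.

```python
def matched_positions(guesses, gear, start=1):
    """Return (position, item) pairs for every gear item that was guessed."""
    return [(pos, item)
            for pos, item in enumerate(gear, start=start)
            if item in guesses]


gear = ("GPS", "map", "compass", "firstaid", "water")
guesses = ["water", "GPS", "tent"]
print(matched_positions(guesses, gear))  # [(1, 'GPS'), (5, 'water')]
```

Passing start=0 instead would report zero-based positions, so GPS would come out at position 0.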
I am having trouble understanding how one of the for loops works in Learn Python the Hard Way ex.41. http://learnpythonthehardway.org/book/ex41.html Below is the code from the lesson.
The loop that I am confused about is for i in range(0, snippet.count("###")):
Is it iterating over a range from 0 to snippet (of which there are 6 snippets), and adding the extra value of the count of "###"? So for the next line of code, param_count = random.randint(1,3), the extra value of "###" is applied? Or am I way off!?
Cheers
Darren
import random
from urllib import urlopen
import sys

WORD_URL = "http://learncodethehardway.org/words.txt"
WORDS = []

PHRASES = {
    "class %%%(%%%):":
      "Make a class named %%% that is-a %%%.",
    "class %%%(object):\n\tdef __init__(self, ***)" :
      "class %%% has-a __init__ that takes self and *** parameters.",
    "class %%%(object):\n\tdef ***(self, ###)":
      "class %%% has-a function named *** that takes self and ### parameters.",
    "*** = %%%()":
      "Set *** to an instance of class %%%.",
    "***.***(###)":
      "From *** get the *** function, and call it with parameters self, ###.",
    "***.*** = '***'":
      "From *** get the *** attribute and set it to '***'."
}

# do they want to drill phrases first
PHRASE_FIRST = False
if len(sys.argv) == 2 and sys.argv[1] == "english":
    PHRASE_FIRST = True

# load up the words from the website
for word in urlopen(WORD_URL).readlines():
    WORDS.append(word.strip())

def convert(snippet, phrase):
    class_names = [w.capitalize() for w in
                   random.sample(WORDS, snippet.count("%%%"))]
    other_names = random.sample(WORDS, snippet.count("***"))
    results = []
    param_names = []

    for i in range(0, snippet.count("###")):
        param_count = random.randint(1,3)
        param_names.append(', '.join(random.sample(WORDS, param_count)))

    for sentence in snippet, phrase:
        result = sentence[:]
        # fake class names
        for word in class_names:
            result = result.replace("%%%", word, 1)
        # fake other names
        for word in other_names:
            result = result.replace("***", word, 1)
        # fake parameter lists
        for word in param_names:
            result = result.replace("###", word, 1)
        results.append(result)

    return results

# keep going until they hit CTRL-D
try:
    while True:
        snippets = PHRASES.keys()
        random.shuffle(snippets)

        for snippet in snippets:
            phrase = PHRASES[snippet]
            question, answer = convert(snippet, phrase)
            if PHRASE_FIRST:
                question, answer = answer, question

            print question

            raw_input("> ")
            print "ANSWER: %s\n\n" % answer
except EOFError:
    print "\nBye"
snippet.count("###") returns the number of times "###" appears in snippet.
If "###" appears 6 times, then range(0, 6) makes the for loop run six times, with i taking the values 0 to 5.
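A quick way to convince yourself of this is to run the two calls on their own. The first snippet string below is taken from the PHRASES dictionary in the question; the second string is made up for illustration.

```python
snippet = "class %%%(object):\n\tdef ***(self, ###)"

n = snippet.count("###")      # how many times "###" occurs in the string
print(n)                      # 1
print(list(range(0, n)))      # [0] -> the loop body runs once

made_up = "a ### b ### c ###"
m = made_up.count("###")
print(list(range(0, m)))      # [0, 1, 2] -> three iterations, i = 0, 1, 2
```

So the loop is not iterating over the snippets at all; it just repeats its body once per "###" placeholder in the current snippet.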
The "try except" block runs the program until the user hits Ctrl-D.
The "while True" loop inside "try" stores the list of keys from the PHRASES dictionary in snippets. The order of the keys is different each time (because of the shuffle method). The for loop inside that while loop goes through each snippet and calls the convert method on the key and value of that snippet.
All the convert method does is replace the %%%, *** and ### in that key and value with random words from the URL's list of words, and return a list (results) consisting of two strings: one made from the key and one made from the value.
Then the program prints one of the strings as a question, then gets user input (using raw_input("> ")), but no matter what the user entered, it prints the other returned string as the answer.
Inside the convert method, we have three different lists: class_names, other_names, and param_names.
To make class_names, the program counts the number of %%% inside that key (or value; they contain the same number of %%% anyway). class_names will be a list of random words whose length is the count of %%%.
other_names is again a list of random words. How many words? As many as the number of *** found in the key (or value; it does not matter which one, because the number is the same within any pair).
param_names is a list of strings whose length is the number of ### found. Each string consists of one, two or three different words separated by ", ".
'result' is a string. The program goes over the three lists (class_names, other_names and param_names) and replaces each placeholder in the result string with the word it has prepared for it, then appends the string to the results list. The (for sentence in snippet, phrase:) loop runs two times because 'snippet' and 'phrase' are two different strings. So the 'result' string is built twice (once for the question and once for the answer).
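The replacement step described above can be tried in isolation with a small sketch. The word list here is a made-up stand-in for the one downloaded from the URL, and fill is a hypothetical helper name; the key detail is the third argument to replace, which limits each call to the first remaining placeholder.

```python
import random

# Stand-in for the word list downloaded in the real program.
WORDS = ["apple", "bread", "cloud", "drum", "eagle", "fern"]


def fill(template, names, marker):
    """Replace successive occurrences of marker with successive names."""
    for name in names:
        # The count argument 1 means only the first remaining marker
        # is replaced, so each name fills exactly one placeholder.
        template = template.replace(marker, name, 1)
    return template


snippet = "***.***(###)"
other_names = random.sample(WORDS, snippet.count("***"))
param_names = [', '.join(random.sample(WORDS, random.randint(1, 3)))
               for _ in range(snippet.count("###"))]

result = fill(fill(snippet, other_names, "***"), param_names, "###")
print(result)
```

The output differs on every run because the words are sampled at random, but no placeholder is ever left behind.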
I put one part of this program to a smaller sub program to clarify how a list of a certain size from random words in the url is created:
https://github.com/MahshidZ/python-recepies/blob/master/random_word_set.py
Finally, I suggest putting print statements anywhere in the code that you need to understand better. As an example, for this code I printed a number of variables to see exactly what is going on. This is a good way of debugging without a debugger (look for the boolean variable DEBUG in my code):
DEBUG = 1
if DEBUG:
    print "snippet: ", snippet
    print "phrase: ", phrase
    print "class names: ", class_names
    print "other names: ", other_names
    print "param names: ", param_names