Joining two lists (alternating) - list

I have two lists that I am joining but running into issues.
First list is called header
header= list(last)
header
['AMC', 'AMD', 'EDU', 'F', 'FCEL', 'LCID']
Second list is called
lastlist= lastr.values.tolist()
lastlist
[[22.418, 1.627, 0.121, 2.365, 1.019, 4.574]]
To join the two lists together, I use the zip function.
master = []
for sym, num in zip(header, lastlist):
    master.append(' '.join((sym, num)))
print(master)
However, I get this error:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-195-33ee9da2ea12> in <module>
      2 
      3 for sym, num in zip(header, lastlist):
----> 4     master.append(' '.join((sym, num)))
      5 
      6 print(master)
TypeError: sequence item 1: expected str instance, list found
I want my new list to be in this format:
master = [AMC 22.418, AMD 1.627,...]
Any help would be appreciated.
Thanks
Yasser

Try this instead. lastlist is a nested list, so zip against lastlist[0], and join() accepts only strings, so wrap the number in str(num). If you have any questions, please ask.
master = []
for sym, num in zip(header, lastlist[0]):
    master.append(' '.join((sym, str(num))))
print(master)
Output:
['AMC 22.418', 'AMD 1.627', 'EDU 0.121', 'F 2.365', 'FCEL 1.019', 'LCID 4.574']
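The same fix can also be written as a list comprehension. The following is a minimal sketch in Python 3 syntax, assuming header and lastlist hold exactly the values shown in the question; the f-string does the number-to-string conversion that str(num) does above.

```python
# Values copied from the question; lastlist is a list containing one inner list.
header = ['AMC', 'AMD', 'EDU', 'F', 'FCEL', 'LCID']
lastlist = [[22.418, 1.627, 0.121, 2.365, 1.019, 4.574]]

# Pair each symbol with the matching number from the inner list
# and format the pair as a single string.
master = [f"{sym} {num}" for sym, num in zip(header, lastlist[0])]
print(master)
```

If lastr ever has more than one row, lastlist[0] only covers the first one, so the indexing would need to be adapted.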


Python strip() and readlines()

I have a script that compares a value from a CSV file to a threshold that I have set within the .py file.
My CSV file looks similar to the following, but with 1030 lines:
-46.62
-47.42
-47.36
-47.27
-47.36
-47.24
-47.24
-47.03
-47.12
Note: there are no blank lines between the values, but there is a single space before each of them.
My first attempt was with this code:
file_in5 = open('710_edited_capture.csv', 'r')
line5=file_in5.readlines()
a=line5[102]
b=line5[307]
c=line5[512]
d=line5[717]
e=line5[922]
print[a]
print[b]
print[c]
print[d]
print[e]
which gave the output of:
[' -44.94\n']
[' -45.06\n']
[' -45.09\n']
[' -45.63\n']
[' -45.92\n']
My first thought was to use .strip() to remove the space and the \n, but lists do not have this method, so it returns the error:
Traceback (most recent call last):
File "/root/test.py", line 101, in <module>
line5=line5.strip()
AttributeError: 'list' object has no attribute 'strip'
My next code below:
for line5 in file_in5:
line5=line5.strip()
line5=file_in5.readlines()
a=line5[102]
b=line5[307]
c=line5[512]
d=line5[717]
e=line5[922]
print[a]
print[b]
print[c]
print[d]
print[e]
Returns another error:
Traceback (most recent call last):
File "/root/test.py", line 91, in <module>
line5=file_in5.readlines()
ValueError: Mixing iteration and read methods would lose data
What is the most efficient way to read in just 5 specific lines without any spaces or \n, and then be able to use them in subsequent calculations such as:
if a>threshold and a>b and a>c and a>d and a>e:
    print ('a is highest and within limit')
    CF=a
You can use strip(), but it is a string method, so you need to apply it to each line rather than to the list returned by readlines() (or apply it to the single string returned by read()). If you have more than one comma-separated value per row, you can use the code below:
with open('710_edited_capture.csv', 'r') as file:
    file_content=file.readlines()
    for line in file_content:
        vals = line.strip().split(',')
        print(vals)
You can also append "vals" to an empty list. As a result, you will get a list that contains a list of values for each line.
It's a little unclear what you want to do, but if you just want to read a file, compare each value to a threshold, and keep the values above it, here is an example:
threshold=46.2
outlist=[]
with open('data.csv', 'r') as data:
    for i in data:
        if float(i)>threshold:
            outlist.append(i)
Then you can adapt it to your needs.
Thanks for all the comments and suggestions; however, they are not quite what I needed.
I have applied a workaround instead, although it is admittedly clunky.
I have created 5 additional files from the original, each containing only one value. From these I can now strip the space and the \n and save each value as a variable. I no longer need readlines().
These variables can be compared to each other and to the threshold to determine the optimum choice.
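As an alternative to splitting the data into five files, the selected lines can be stripped and converted to floats one at a time. The following is a minimal sketch in Python 3 syntax; read_selected_values is a hypothetical helper, the io.StringIO object stands in for the real CSV file, and the threshold value is made up for illustration.

```python
import io


def read_selected_values(lines, indices):
    """Return the lines at the given 0-based indices as floats.

    `lines` can be an open file or any iterable of strings. Each selected
    line is stripped of surrounding whitespace (including the leading space
    and the trailing newline) before conversion.
    """
    wanted = set(indices)
    found = {}
    for index, line in enumerate(lines):
        if index in wanted:
            found[index] = float(line.strip())
    return [found[i] for i in indices]


# Stand-in for the CSV file (assumption: one value per line, leading space).
sample = io.StringIO(" -46.62\n -47.42\n -47.36\n -47.27\n -47.36\n")
a, b, c = read_selected_values(sample, [0, 2, 4])

threshold = -50.0  # hypothetical threshold, not taken from the question
if a > threshold and a > b and a > c:
    cf = a
```

With the real file, the same helper could be called inside a with open('710_edited_capture.csv') block using the indices 102, 307, 512, 717 and 922. Converting to float matters here: comparing the raw strings would compare them character by character rather than numerically.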

Can list become an element of set in python

I really wonder why the second one gives an error:
It would be great if someone could clarify whether a list can be used as an element of a set, or whether mutable objects are not allowed inside a set at all.
1)
>>> x = set(["Perl", "Python", "Java"])
>>> x
set(['Python', 'Java', 'Perl'])
>>>
2)
>>> cities = set((["Python","Perl"], ["Paris", "Berlin", "London"]))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
>>>
As you noted, you can't have a list as a member of a set (because it's not hashable).
I think you've been confused by the repr of the set in your first example. The output set(['Python', 'Java', 'Perl']) doesn't indicate that the set contains a 3-element list. Rather, it contains the three strings, with the list just being part of the notation the repr uses (since the set constructor expects an iterable of items). Note that the order of the items changes from your input to the arbitrary order of the output!
In Python 3, the set type's repr uses set-literal syntax instead:
>>> x = set(["Perl", "Python", "Java"])
>>> x
{'Java', 'Perl', 'Python'}
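If the goal is a set whose members are themselves groups of strings, hashable containers such as tuples or frozensets can stand in for the lists. The following is a minimal sketch in Python 3 syntax; the variable names are illustrative only.

```python
# Tuples are immutable and hashable, so they can be set members.
cities = {("Python", "Perl"), ("Paris", "Berlin", "London")}

# frozenset is the hashable counterpart of set; element order is ignored.
groups = {frozenset(["Python", "Perl"]), frozenset(["Perl", "Python"])}

# A list is mutable and unhashable, so this raises TypeError.
try:
    bad = {["Python", "Perl"]}
except TypeError as exc:
    error_message = str(exc)
```

Here groups ends up with a single member, because the two frozensets compare equal regardless of the order in which their elements were given.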

Python: Extracting floats from files in a complex directory tree - Are loops the answer?

I have just started doing my first research project, and I have just begun programming (approximately 2 weeks ago). Excuse me if my questions are naive. I might be using python very inefficiently. I am eager to improve here.
I have experimental data that I want to analyse. My goal is to create a python script that takes the data as input, and that for output gives me graphs, where certain parameters contained in text files (within the experimental data folders) are plotted and fitted to certain equations. This script should be as generalizable as possible so that I can use it for other experiments.
I'm using the Anaconda, Python 2.7, package, which means I have access to various libraries/modules related to science and mathematics.
I am stuck at trying to use For and While loops (for the first time).
The data files are structured like this (I am using regex brackets here):
.../data/B_foo[1-7]/[1-6]/D_foo/E_foo/text.txt
What I want to do is to cycle through all 7 top directories and each of their 6 subdirectories (named 1,2,3...6). Furthermore, within each of these 6 subdirectories, a text file can be found (always with the same filename, text.txt), which contains the data I want to access.
The 'text.txt' files are structured something like this:
1 91.146 4.571 0.064 1.393 939.134 14.765
2 88.171 5.760 0.454 0.029 25227.999 137.883
3 88.231 4.919 0.232 0.026 34994.013 247.058
4 ... ... ... ... ... ...
The table continues down. Every other row is empty. I want to extract information from 13 rows starting from the 8th line, and I'm only interested in the 2nd, 3rd and 5th columns. I want to put them into lists 'parameter_a' and 'parameter_b' and 'parameter_c', respectively. I want to do this from each of these 'text.txt' files (of which there is a total of 7*6 = 42), and append them to three large lists (each with a total of 7*6*13 = 546 items when everything is done).
This is my attempt:
First, I made a list, 'list_B_foo', containing the seven different 'B_foo' directories (this part of the script is not shown). Then I made this:
parameter_a = []
parameter_b = []
parameter_c = []
j = 7 # The script starts reading 'text.txt' after the j:th line.
k = 35 # The script stops reading 'text.txt' after the k:th line.
x = 0
while x < 7:
    for i in range(1, 7):
        path = str(list_B_foo[x]) + '/%s/D_foo/E_foo/text.txt' % i
        m = open(path, 'r')
        line = m.readlines()
        while j < k:
            line = line[j]
            info = line.split()
            print 'info:', info
            parameter_a.append(float(info[1]))
            parameter_b.append(float(info[2]))
            parameter_c.append(float(info[5]))
            j = j + 2
    x = x + 1
parameter_a_vect = np.array(parameter_a)
parameter_b_vect = np.array(parameter_b)
parameter_c_vect = np.array(parameter_c)
print 'a_vect:', parameter_a_vect
print 'b_vect:', parameter_b_vect
print 'c_vect:', parameter_c_vect
I have tried to fiddle around with indentation without getting it to work (receiving either syntax error or indentation errors). Currently, I get this output:
info: ['1', '90.647', '4.349', '0.252', '0.033', '93067.188', '196.142']
info: ['.']
Traceback (most recent call last):
File "script.py", line 104, in <module>
parameter_a.append(float(info[1]))
IndexError: list index out of range
I don't understand why I get the "list index out of range" message. If anyone knows why this is the case, I would be happy to hear you out.
How do I solve this problem? Is my approach completely wrong?
EDIT: I went for a pure while-loop solution, taking RebelWithoutAPulse and CamJohnson26's suggestions into account. This is how I solved it:
parameter_a=[]
parameter_b=[]
parameter_c=[]
k=35 # The script stops reading 'text.txt' after the k:th line.
x=0
while x < 7:
    y=1
    while y < 7:
        j=7
        path = str(list_B_foo[x]) + '/%s/pdata/999/dcon2dpeaks.txt' % (y)
        m = open(path, 'r')
        lines = m.readlines()
        while j < k:
            line = lines[j]
            info = line.split()
            parameter_a.append(float(info[1]))
            parameter_b.append(float(info[2]))
            parameter_c.append(float(info[5]))
            j = j+2
        y = y+1
    x = x+1
Meta: I am not sure whether I should accept the answer from the person who answered quickest and helped me finish my task, or from the person whose answer I learned the most from. I am sure this is a common issue that I can find an answer to by reading the rules or going to Stack Exchange Meta. Until I've read up on the recommendations, I will hold off on marking either of your answers as accepted.
Welcome to Stack Overflow!
The error is due to a name collision that you have inadvertently created. Note the output before the exception occurs:
info: ['1', '90.647', '4.349', '0.252', '0.033', '93067.188', '196.142']
info: ['.']
Traceback (most recent call last):
...
The expression info[1] fails because the list contains only '.', so it has no element at index 1 (Python lists are indexed from 0).
This happens in your nested loop,
while j < k
where you reassign the very variable, line, that holds the lines you read previously:
line = m.readlines()
while j < k:
    line = line[j]
    info = line.split()
    ...
So on the first pass through the loop, you have read the lines of the file into the list line, then you take one line from that list and assign it to line again, and the loop continues. At this point line contains a string.
On the next pass, indexing line reads the character at position j of that string rather than a line of the file, and the code fails.
You can fix this by using different names.
P.S. I would suggest using with ... as ... syntax while working with files, it is briefly described here - this is called a context manager and it takes care of opening and closing the files for you.
P.P.S. I would also suggest reading the naming conventions
Looks like you are overwriting the line array with the first line of the file. You call line = m.readlines(), which sets line equal to an array of lines. You then set line = line[j], so now the line variable is no longer an array, it's a string equal to
1 91.146 4.571 0.064 1.393 939.134 14.765
This pass through the loop works fine, but the next pass will treat line as an array of chars and take the character at index j, which is just a period, and set line equal to it. That explains why the info variable only has one element on the second pass through the loop.
To solve this, just use 2 line variables instead of one. Call one lines and the other line.
lines = m.readlines()
while j < k:
    line = lines[j]
    info = line.split()
There may be other errors too, but that should get you started.
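The two suggestions above (separate names for the list and the single line, and a with block for the file) can be combined with slicing, which removes the need for the j counter. The following is a minimal sketch in Python 3 syntax; extract_columns and read_columns are hypothetical helpers, the defaults mirror the j, k and column indices used in the question, and the sample data is synthetic.

```python
def extract_columns(lines, start=7, stop=35, step=2, columns=(1, 2, 5)):
    """Collect the chosen columns, as floats, from lines[start:stop:step].

    Returns one list per requested column, in the same order as `columns`.
    """
    results = [[] for _ in columns]
    for line in lines[start:stop:step]:
        info = line.split()
        for values, column in zip(results, columns):
            values.append(float(info[column]))
    return results


def read_columns(path):
    """Read one file and extract its columns; the file is closed automatically."""
    with open(path, 'r') as handle:
        return extract_columns(handle.readlines())


# Synthetic stand-in for one text.txt file: data on odd-numbered indices,
# empty lines in between (an assumption made only for this demonstration).
sample_lines = []
for n in range(40):
    if n % 2 == 1:
        sample_lines.append("%d %d.5 2.0 3.0 4.0 %d.25 6.0\n" % (n, n, n))
    else:
        sample_lines.append("\n")

parameter_a, parameter_b, parameter_c = extract_columns(sample_lines)
```

In the full script, read_columns(path) could be called once per file inside the two directory loops, extending the three large lists with the returned values each time.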

Python "for i in" + variable

I have the following code:
#Euler Problem 1
print "We are going to solve Project Euler's Problem #1"
euler_number = input('What number do you want to sum up all the multiples?')
a = input('Insert the 1st multiple here: ')
b = input('Insert the 2nd multiple here: ')
total = 0
for i in euler_number:
    if i%a == 0 or i%b == 0:
        total += i
print "Sum of all natural numbers below 'euler_number' that are multiples of 'a'"
print "or 'b' is: ", total
With the following error:
Traceback (most recent call last):
File "euler_1.py", line 10, in <module>
for i in euler_number:
TypeError: 'int' object is not iterable
I tried to search for "for i in" + "variable", and other sorts, but could not find anything...
I have two questions:
What would you have suggested that I search for?
How can I solve this so that I can look for the sum of two multiples for any number?
Any help would be great.
You probably want this:
for i in range(1, euler_number + 1):
In Python, a for loop loops using a series of values. These can be, for example, items in a list; or these can be values that come from an "iterator".
for i in 3 makes no sense in Python, as 3 is not a series of values.
To loop over a series of integers, use range() or xrange().
How does the Python's range function work?
Here's the documentation for Python's for syntax:
http://docs.python.org/2/reference/compound_stmts.html#for
It loops over any sequence of values, not just a sequence of numbers like in other languages.
Have fun learning Python, it's a great language. Also, the Project Euler challenges are fun, so stick with them, and don't give up!
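Putting the range() advice together, the whole calculation can be sketched as below. This is an illustrative Python 3 version with a hypothetical function name, sum_of_multiples. It counts the numbers strictly below the limit, as the question's final print statement describes, so it uses range(1, limit) rather than range(1, limit + 1).

```python
def sum_of_multiples(limit, a, b):
    """Sum the natural numbers below `limit` that are multiples of a or b."""
    return sum(i for i in range(1, limit) if i % a == 0 or i % b == 0)


# Project Euler Problem 1: multiples of 3 or 5 below 1000.
total = sum_of_multiples(1000, 3, 5)
print("Sum of all natural numbers below 1000 that are multiples of 3 or 5 is:", total)
```

In Python 3, input() returns a string, so an interactive version would need to wrap each call in int() before passing the values to the function.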

Why do I get a TypeError here?

if emp in like_list[j]:
TypeError: coercing to Unicode: need string or buffer, list found
Both emp and like_list are lists containing strings.
Because both emp and like_list are lists, you are essentially looking for a list within a list.
If you're trying to match any element within list emp, you can iterate over the list like this:
for element in emp:
    if element in like_list:
        pass  # do something
    else:
        pass  # do something else
Alternatively, if like_list were a list of lists, your if statement would work.
If both emp and like_list are lists of strings, the expression emp in like_list[j] is checking if a list is a member of a single string. When I tested it out with the code below I got a slightly different TypeError:
>>> emp = ["foo", "bar"]
>>> like_list = ["baz", "quux"]
>>> if emp in like_list[0]:
...     print "found"
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'in <string>' requires string as left operand, not list
This says that you can't test non-strings for membership in a string. I think fixing this will be pretty easy, but it's not entirely clear what you were trying to do.
If you want to check if the string like_list[j] has one of the strings in emp as a substring, use:
if any(s in like_list[j] for s in emp):
If instead you want to see if like_list[j] is equal to one of the strings in emp, you need to turn around the in expression:
if like_list[j] in emp:
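The difference between the two checks is easiest to see side by side. The following is a minimal sketch in Python 3 syntax with made-up sample data; only the two expressions themselves come from the answer above.

```python
emp = ["foo", "bar"]
like_list = ["food truck", "bar", "quux"]  # sample data for illustration

# Substring test: does any string in emp occur inside like_list[j]?
substring_hits = [any(s in entry for s in emp) for entry in like_list]

# Equality test: is like_list[j] exactly one of the strings in emp?
equality_hits = [entry in emp for entry in like_list]

print(substring_hits)
print(equality_hits)
```

"food truck" passes the substring test because it contains "foo", but it fails the equality test because it is not itself a member of emp.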