Python strip() and readlines() - python-2.7

I have code that compares a value from a CSV file to a threshold that I have set within the .py file.
My CSV file contains output similar to the following, but with 1030 lines:
-46.62
-47.42
-47.36
-47.27
-47.36
-47.24
-47.24
-47.03
-47.12
Note: there are no blank lines between the values, but there is a single space before each one.
My first attempt was with this code:
file_in5 = open('710_edited_capture.csv', 'r')
line5=file_in5.readlines()
a=line5[102]
b=line5[307]
c=line5[512]
d=line5[717]
e=line5[922]
print[a]
print[b]
print[c]
print[d]
print[e]
which gave the output of:
[' -44.94\n']
[' -45.06\n']
[' -45.09\n']
[' -45.63\n']
[' -45.92\n']
My first thought was to use .strip() to remove the space and the \n but this is not supported in lists and returns the error:
Traceback (most recent call last):
File "/root/test.py", line 101, in <module>
line5=line5.strip()
AttributeError: 'list' object has no attribute 'strip'
My next attempt was the code below:
for line5 in file_in5:
    line5=line5.strip()
line5=file_in5.readlines()
a=line5[102]
b=line5[307]
c=line5[512]
d=line5[717]
e=line5[922]
print[a]
print[b]
print[c]
print[d]
print[e]
This returns another error:
Traceback (most recent call last):
File "/root/test.py", line 91, in <module>
line5=file_in5.readlines()
ValueError: Mixing iteration and read methods would lose data
What is the most efficient way to read in just 5 specific lines without any spaces or \n, and then be able to use them in subsequent calculations such as:
if a>threshold and a>b and a>c and a>d and a>e:
    print ('a is highest and within limit')
    CF=a

You can use strip(), but you need to use read() instead of readlines(). Alternatively, if you have more than one comma-separated value per row, you can use code like this:
with open('710_edited_capture.csv', 'r') as file:
    file_content=file.readlines()
    for line in file_content:
        vals = line.strip().split(',')
        print(vals)
You can also append "vals" to an empty list. As a result, you will get a list that contains a list of values for each line.
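For illustration, a minimal sketch of that suggestion; the filename is taken from the question, and all_rows is just an assumed name for the accumulator:
all_rows = []  # assumed name for the accumulator list
with open('710_edited_capture.csv', 'r') as f:
    for line in f:
        vals = line.strip().split(',')  # drop the leading space and trailing \n, split on commas
        all_rows.append(vals)
# all_rows is now a list of lists, one inner list of values per line
print(all_rows[102])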

It's a little unclear what you want to do, but if you just want to read a file, compare each value to a threshold, and keep the values above it, here is an example:
threshold=46.2
outlist=[]
with open('data.csv', 'r') as data:
    for i in data:
        if float(i)>threshold:
            outlist.append(i)
Then you can adapt it to your needs.

Thanks for all the comments and suggestions, however they are not quite what I needed.
I have, however, applied a workaround, although admittedly a clunky one.
I have created 5 additional files from the original, with only one value in each. From these I can now strip the space and the \n and save each value locally as a variable. I no longer needed readlines().
These variables can be compared to each other and the threshold to determine the optimum choice.
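For reference, a hedged sketch of a less clunky approach that reads the five specific lines directly from the original file; the line indices come from the question, while the float() conversion and the threshold value are assumptions:
with open('710_edited_capture.csv', 'r') as f:
    lines = f.readlines()

# strip the leading space and trailing newline, then convert each value to a number
a, b, c, d, e = (float(lines[i].strip()) for i in (102, 307, 512, 717, 922))

threshold = -46.0  # placeholder value; use whatever limit applies
if a > threshold and a > b and a > c and a > d and a > e:
    print ('a is highest and within limit')
    CF = a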

Convert a single list item of key-value pairs to a dictionary in Python

I have a function that returns a list containing just one key-value-pair item. How can I convert this to an actual dictionary or object type so I can get each attribute from the list? For example, I would like to be able to get just the time or price or any other property, and not the whole list as one item.
{'time': 1512858529643, 'price': '0.00524096', 'origQty': '530.00000000'
I know it doesn't look like a list, but it actually is! The function that I am calling returns this as a list. I am simply storing it in a variable and nothing else.
open_order=client.get_open_orders(symbol="BNBETH",recvWindow=1234567)
If you still have doubts: when I try to print a dictionary item like print(open_order['time']), I get the following error.
Traceback (most recent call last):
File "C:\Python27\python-binance-master\main.py", line 63, in <module>
print(open_order['time'])
TypeError: list indices must be integers, not str
Also, if I check the type, it shows as a list:
print(type(open_order))
So, I was able to come up with a solution, sort of, by converting the list to a string and splitting at the "," character. Now I have a list of items that I can actually print by selecting one, e.g. print(split_order_items[5]). There has to be a better solution.
open_order=client.get_open_orders(symbol="BNBETH",recvWindow=1234567)
y = ''.join(str(e) for e in open_order)
split_order_items = [x.strip() for x in y.split(',')]
print(split_order_items[5])
I was able to create multiple list items using the above code. I just can't seem to convert it to a dictionary object!
Thanks!
What you have posted is a dict, not a list. You can do something like this:
data = {'time': 1512858529643, 'price': '0.00524096', 'orderId': 7848174, 'origQty': '530.00000000'}
print(data['time']) # this gets just the time and prints it
print(data['price']) # this gets just the price and prints it
I strongly suggest reading up on the Python dict: https://docs.python.org/3/tutorial/datastructures.html#dictionaries
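As a hedged aside on the error shown in the question: a TypeError saying "list indices must be integers" usually means the dict is wrapped inside a list, in which case you index the list first and then the dict. A minimal sketch with a made-up one-element list:
# Hypothetical return value: a one-element list wrapping the dict from the question
open_order = [{'time': 1512858529643, 'price': '0.00524096', 'origQty': '530.00000000'}]

order = open_order[0]   # take the single dict out of the list
print(order['time'])    # 1512858529643
print(order['price'])   # 0.00524096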

Hello, I have code that prints what I need in Python, but I'd like it to write that result to a new file

The file looks like a series of lines with IDs:
aaaa
aass
asdd
adfg
aaaa
I'd like to get, in a new file, each ID and its number of occurrences in the old file, in the form:
aaaa 2
asdd 1
aass 1
adfg 1
with the two elements separated by a tab.
The code I have prints what I want but doesn't write to a new file:
with open("Only1ID.txt", "r") as file:
file = [item.lower().replace("\n", "") for item in file.readlines()]
for item in sorted(set(file)):
print item.title(), file.count(item)
As you use Python 2, the simplest approach to convert your console output to file output is by using the print chevron (>>) syntax which redirects the output to any file-like object:
with open("filename", "w") as f: # open a file in write mode
print >> f, "some data" # print 'into the file'
Your code could look like this after simply adding another open to open the output file and adding the chevron to your print statement:
with open("Only1ID.txt", "r") as file, open("output.txt", "w") as out_file:
file = [item.lower().replace("\n", "") for item in file.readlines()]
for item in sorted(set(file)):
print >> out_file item.title(), file.count(item)
However, your code has a few other more or less bad things which one should not do or could improve:
Do not use the same variable name file for both the file object returned by open and your processed list of strings. This is confusing, just use two different names.
You can directly iterate over the file object, which works like a generator that returns the file's lines as strings. A generator produces the next element just in time: it does not first load the whole file into memory (as file.readlines() does) and process it afterwards, but reads and stores only one line at a time, whenever the next line is needed. That way you improve the code's performance and resource efficiency.
If you write a list comprehension, but you don't need its result necessarily as list because you simply want to iterate over it using a for loop, it's more efficient to use a generator expression (same effect as the file object's line generator described above). The only syntactical difference between a list comprehension and a generator expression are the brackets. Replace [...] with (...) and you have a generator. The only downside of a generator is that you neither can find out its length, nor can you access items directly using an index. As you don't need any of these features, the generator is fine here.
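A small illustration of that difference, using a toy range() instead of a file (the variable names are made up):
squares_list = [n * n for n in range(5)]   # list comprehension: builds the whole list in memory
squares_gen = (n * n for n in range(5))    # generator expression: produces values one at a time

print len(squares_list)    # works: 5
for value in squares_gen:  # a generator can only be iterated, not indexed or passed to len()
    print value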
There is a simpler way to remove trailing newline characters from a line: line.rstrip() removes all trailing whitespaces. If you want to keep e.g. spaces, but only want the newline to be removed, pass that character as argument: line.rstrip("\n").
However, it could possibly be even easier and faster to just not add another implicit line break during the print call instead of removing it first to have it re-added later. You would suppress the line break of print in Python 2 by simply adding a comma at the end of the statement:
print >> out_file, item.title(), file.count(item),
There is a type Counter to count occurrences of elements in a collection, which is faster and easier than writing it yourself, because you don't need the additional count() call for every element. The Counter behaves mostly like a dictionary with your items as keys and their count as values. Simply import it from the collections module and use it like this:
from collections import Counter
c = Counter(lines)
for item in c:
    print item, c[item]
With all those suggestions (except the one not to remove the line breaks) applied and the variables renamed to something more clear, the optimized code looks like this:
from collections import Counter
with open("Only1ID.txt") as in_file, open("output.txt", "w") as out_file:
counter = Counter(line.lower().rstrip("\n") for line in in_file)
for item in sorted(counter):
print >> out_file item.title(), counter[item]

Python 2.7.10 ValueError: Incomplete Format if statement

def func():
    import csv
    file=open("cmiday.csv")
    x,y=[],[]
    reader=csv.DictReader(file)
    for row in reader:
        if(type(row["max_rel_hum"])%1==0):
            continue
        if(type(row["precip"])%1==0):
            continue
        if(row["max_rel_hum"]>100):
            continue
        if(row["max_rel_hum"]<0):
            continue
        if (row["precip"]>10):
            continue
        if(row["precip"]<0):
            continue
        x.append(row["max_rel_hum"])
        y.append(row["precip"])
    print x
    print y
I'm trying to collect data from a csv file into lists x and y. I don't want any values for row["max_rel_hum"] to be integers or be more than 100 or less than 0. Similarly, I don't want any values for row["precip"] to be more than 10 or less than 0. I'm getting this error when I try to run the function:
>>> func()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "hw.py", line 7, in func
if(row["max_rel_hum"]%1==0)
ValueError: incomplete format
Please help out. Thanks
Values from a CSV are strings, not integers. You're expecting % to do modulo, but on a string it does string formatting.
You need something like this:
if ( int(row["max_rel_hum"]) % 1 == 0):
And you need to do int() in all the lines, even the < and > ones - they are valid operations on strings, but they will do an alphabetical comparison, not a numeric comparison, and won't give the results you expect.
You don't need type() in the if line at all.
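For illustration, a hedged sketch of the loop with the conversion applied; float() is used here (rather than int()) so the % 1 check can actually detect whole numbers, and the column names and bounds are taken from the question:
import csv

x, y = [], []
with open("cmiday.csv") as f:
    reader = csv.DictReader(f)
    for row in reader:
        hum = float(row["max_rel_hum"])      # CSV fields are strings; convert first
        precip = float(row["precip"])
        if hum % 1 == 0 or precip % 1 == 0:  # skip whole-number values
            continue
        if not (0 <= hum <= 100):            # keep humidity within 0..100
            continue
        if not (0 <= precip <= 10):          # keep precipitation within 0..10
            continue
        x.append(hum)
        y.append(precip)
print x
print y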

What do the ">>" symbols mean in Python code: map(chr,[x,x>>8,y])

The error I get in another file that uses it is:
Traceback (most recent call last):
File "C:\Anaconda\lib\site-packages\pyahoolib-0.2-py2.7.egg\yahoo\session.py", line 107, in listener
t.send_pk(consts.SERVICE_AUTHRESP, auth.hash(t.login_id, t.passwd, p[94]))
File "C:\Anaconda\lib\site-packages\pyahoolib-0.2-py2.7.egg\yahoo\auth.py", line 73, in hash
hs = md5.new(mkeystr+"".join(map(chr,[x,x>>8,y]))).digest()
ValueError: chr() arg not in range(256)
UPDATE: #merlin2011: This is confusing me. The code is hs = md5.new(mkeystr+"".join(map(chr,[x,x>>8,y]))).digest(),
where chr has a comma after it. I thought it was a function from docs.python.org: chr(i)
Return a string of one character whose ASCII code is the integer i. For example, chr(97) returns the string 'a'. This is the inverse of ord(). The argument must be in the range [0..255], inclusive; ValueError will be raised if i is outside that range. See also unichr().
If so, is [x,x>>8,y] an iterable for map() that I just don't recognize yet?
Also, I don't want to change any of this code because it is part of the pyahoolib-0.2 auth.py file. But to get it all working I do not know what to do.
It's the Binary Right Shift Operator:
From Python Wiki:
x >> y:
Returns x with the bits shifted to the right by y places. This is the same as integer-dividing (//) x by 2**y.
In case you were wondering, the error message means that chr() only accepts arguments in the range 0 to 255 (i.e. range(256)), and your map call is passing it a value outside that range.
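A small, self-contained illustration of the shift and of why chr() can fail here; the value of x is made up for the example:
x = 0x1234            # 4660, an example value larger than one byte

print x >> 8          # 18 (0x12): bits shifted right by 8, i.e. x integer-divided by 2**8
print chr(x & 0xff)   # '4' (0x34): the low byte is inside range(256), so chr() is happy
# chr(x) would raise "ValueError: chr() arg not in range(256)", because 4660 >= 256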

For loop using enumerate through a list with an if statement to search lines for a particular string

I am trying to compile a list of recurring strings (transaction IDs).
I am flummoxed. I've researched the correct method and feel like this code should work.
However, I'm doing something wrong in the second block.
This first block correctly compiles a list of the strings that I want.
I can't get this second block to work. If I simplify, I can print each value in the list
by using
for idx, val in enumerate(tidarray): print val
It seems like I should now be able to use that value to search each line for that string, then print the line (actually I'll be using it in conjunction with another search term to reduce the number of line reads, but this is my basic test before honing in further).
def main():
    pass

samlfile= "2013-08-18 06:24:27,410 tid:5af193fdc DEBUG org.sourceid.saml20.domain.AttributeMapping] Source attributes:{SAML_AUTHN_CTX=urn:oasis:names:tc:SAML:2.0:ac:classes"
tidarray = []
for line in samlfile:
    if "tid:" in line:
        str=line
        tid = re.search(r'(tid:.*?)(?= )', str)
        if tid.group() not in tidarray:
            tidarray.append(tid.group())
for line in samlfile:
    for idx, val in enumerate(tidarray):
        if val in line:
            print line
Can someone suggest a correction for the second block of code? I recognize that reading the file twice isn't the most elegant solution... My main goal here is to learn how to enumerate through the list and use each value in the subsequent code.
Iterating over a file twice
Basically what you do is:
for line in somefile: pass # first run
for line in somefile: pass # second run
The first run will complete just fine, the second run will not run at all.
This is because the file was read until the end and there's no more data to read lines from.
Call somefile.seek(0) to go to the beginning of the file:
for line in somefile: pass # first run
somefile.seek(0)
for line in somefile: pass # second run
Storing things uniquely
Basically, what you seem to want is a way to store the IDs from the file in a data structure such that every ID appears in it only once.
If you want to store elements uniquely you use, for example, dictionaries (help(dict))
or sets (help(set)). Example with sets:
myset = set()
myset.add(2) # set([2])
myset.add(3) # set([2,3])
myset.add(2) # set([2,3])
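Putting both points together, a hedged sketch of the two-pass approach from the question; it assumes samlfile is an open file object (the filename 'saml.log' is made up here) and reuses the regex from the question:
import re

with open('saml.log') as samlfile:      # assumed filename
    tids = set()                        # a set keeps each transaction ID only once
    for line in samlfile:
        if "tid:" in line:
            match = re.search(r'(tid:.*?)(?= )', line)
            if match:
                tids.add(match.group())

    samlfile.seek(0)                    # rewind so the file can be iterated a second time
    for line in samlfile:
        for tid in tids:
            if tid in line:
                print line
                break                   # stop checking further IDs once the line is printed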