error with extracting a list from a file - list

In Python 2.7, I used this code to write a list of floats into a file:
with open(filename, "a") as file:
writer = csv.writer(file)
writer.writerow(GGDr_L)
and to get this list, I used
for numberString in line.split(","):
number = np.float64(numberString)
RL.append(number)
it works very well in python 2.7 but with python 3.6, i got an error related to the latest float in the list, I got
number = np.float64(numberString)
ValueError: could not convert string to float:
when I tried to print this last float in the list, it was writed correctly, but with empty line after the float!! could this be the reason of the error??

Related

NameError: name 'Right' is not defined

I'm trying to execute the following code in Spyder 3.3.1 using Python 2.7.15. I'm a beginner.
text = str(input("You are lost in forest..."))
while text == "Right":
text = str(input("You are lost in forest..."))
print "You got out of the forest!!!"
When I run the code with integer value it works. For example the following piece of code:
text = input("You are lost in forest...")
while text == 1:
text = input("You are lost in forest...")
print "You got out of the forest!!!"
How can I make the input() work with a string value? Thank you for your help.
Use raw_input() instead of input():
value = raw_input("You are lost in forest...") # String input
value = int(raw_input("You are lost in forest...")) # Integer input
...
In python 2, raw_input() takes exactly what the user typed and passes it back as a string. input() tries to understand the data entered by the user.Hence expects a syntactically correct python statement.That's why you got an error when you enter Right. So you can enter "Right" as input to fix this error.
But it is better to use raw_input() instead of input().

Python iterate through txt file of dates and change date format

I have a txt file oldDates.txt. I want to loop through it, modify the date formats and write the newly formatted dates to a new txt file. My code so far:
from datetime import datetime
f = open('oldDates.txt', r)
oldDates = []
newDates = []
for line in f.readlines():
oldDates.append(line)
print(line) # for testing
for oldDate in oldDates:
dt = datetime.strptime(oldDate, '%d/%m/%Y').strftime('%d/%m/%Y')
newDates.append(dt)
with open('newDates.txt', 'w') as w:
for newDate in newDates:
w.write(newDate+"\n")
f.close()
w.close()
However, this give an error:
ValueError: unconverted data remains
I'm not sure where I'm going wrong here, and if there's a more efficient way of doing this then I'd be glad to hear about it. The date conversion seems to work fine from the test print.
There are blank lines in the file and I'm wondering if I need to handle these (I'm not sure how).
Any help much appreciated!
Now, I am no expert in Python, but do you verify your input to be the correct format ? If you apply a regexp on the input line you can easily catch blank lines and any incorrect values.

Python Decoding and Encoding, List Element utf-8

just another question about encoding in python i think. I have this programm:
regex = re.compile(ur'\b[sw]\w+', flags= re.U | re.I)
ergebnisliste = []
for line in fileobject:
print str(line)
erg = regex.findall(line)
ergebnisliste = ergebnisliste + erg
ergebnislistesortiert = sorted(ergebnisliste, key=lambda x: len(x))
print ergebnislistesortiert
fileobject.close()
I am searching a textfile for words beginning with s or w. My "ergebnislistesortiert" is the sorted result list.
I will print the result list and there appers to be a problem with the encoding:
['so', 'Wer', 'sp\xc3']
the 'sp\xc3' should be print as spät. What is wrong here? Why is the list element utf-8?
And how can i get the right decoding to print "spät"?
Thanks a lot guys!
\xc3 is not UTF-8. It's a fragment of the full UTF-8 encoding of U+00E4 but you're probably reading it with something like a Latin-1 decoder (which is effectively what Python 2 does if you read bytes without specifying an encoding), in which case the second byte in the UTF-8 sequence isn't matched by \w.
The real fix is to decode the data when you are reading it into Python in the first place. If you are writing new code, switching to Python 3 is probably the best and easiest fix.
If you're stuck on Python 2.7, a somewhat Python 3-compatible approach is something like
import io
fileobject = io.open(filename, encoding='utf-8')
If you have control over the input file and want to postpone the proper solution until you are older, (ask your parents for permission to) convert the UTF-8 input file to some legacy 8-bit encoding.

Why must I run this code a few times before my entire .csv file is converted into a .yaml file?

I am trying to build a tool that can convert .csv files into .yaml files for further use. I found a handy bit of code that does the job nicely from the link below:
Convert CSV to YAML, with Unicode?
which states that the line will take the dict created by opening a .csv file and dump it to a .yaml file:
out_file.write(ry.safe_dump(dict_example,allow_unicode=True))
However, one small kink I have noticed is that when it is run once, the generated .yaml file is typically incomplete by a line or two. In order to have the .csv file exhaustively read through to create a complete .yaml file, the code must be run two or even three times. Does anybody know why this could be?
UPDATE
Per request, here is the code I use to parse my .csv file, which is two columns long (with a string in the first column and a list of two strings in the second column), and will typically be 50 rows long (or maybe more). Also note that it designed to remove any '\n' or spaces that could potentially cause problems later on in the code.
csv_contents={}
with open("example1.csv", "rU") as csvfile:
green= csv.reader(csvfile, dialect= 'excel')
for line in green:
candidate_number= line[0]
first_sequence= line[1].replace(' ','').replace('\r','').replace('\n','')
second_sequence= line[2].replace(' ','').replace('\r','').replace('\n','')
csv_contents[candidate_number]= [first_sequence, second_sequence]
csv_contents.pop('Header name', None)
Ultimately, it is not that important that I maintain the order of the rows from the original dict, just that all the information within the rows is properly structured.
I am not sure what would cause could be but you might be running out of memory as you create the YAML document in memory first and then write it out. It is much better to directly stream it out.
You should also note that the code in the question you link to, doesn't preserve the order of the original columns, something easily circumvented by using round_trip_dump instead of safe_dump.
You probably want to make a top-level sequence (list) as in the desired output of the linked question, with each element being a mapping (dict).
The following parses the CSV, taking the first line as keys for mappings created for each following line:
import sys
import csv
import ruamel.yaml as ry
import dateutil.parser # pip install python-dateutil
def process_line(line):
"""convert lines, trying, int, float, date"""
ret_val = []
for elem in line:
try:
res = int(elem)
ret_val.append(res)
continue
except ValueError:
pass
try:
res = float(elem)
ret_val.append(res)
continue
except ValueError:
pass
try:
res = dateutil.parser.parse(elem)
ret_val.append(res)
continue
except ValueError:
pass
ret_val.append(elem.strip())
return ret_val
csv_file_name = 'xyz.csv'
data = []
header = None
with open(csv_file_name) as inf:
for line in csv.reader(inf):
d = process_line(line)
if header is None:
header = d
continue
data.append(ry.comments.CommentedMap(zip(header, d)))
ry.round_trip_dump(data, sys.stdout, allow_unicode=True)
with input xyz.csv:
id, title_english, title_russian
1, A Title in English, Название на русском
2, Another Title, Другой Название
this generates:
- id: 1
title_english: A Title in English
title_russian: Название на русском
- id: 2
title_english: Another Title
title_russian: Другой Название
The process_line is just some sugar that tries to convert strings in the CSV file to more useful types and strings without leading spaces (resulting in far less quotes in your output YAML file).
I have tested the above on files with 1000 rows, without any problems (I won't post the output though).
The above was done using Python 3 as well as Python 2.7, starting with a UTF-8 encoded file xyz.csv. If you are using Python 2, you can try unicodecsv if you need to handle Unicode input and things don't work out as well as they did for me.

python insert to postgres over psycopg2 unicode characters

Hi guys I am having a problem with inserting utf-8 unicode character to my database.
The unicode that I get from my form is u'AJDUK MARKO\u010d'. Next step is to decode it to utf-8. value.encode('utf-8') then I get a string 'AJDUK MARKO\xc4\x8d'.
When I try to update the database, works the same for insert btw.
cur.execute( "UPDATE res_partner set %s = '%s' where id = %s;"%(columns, value, remote_partner_id))
The value gets inserted or updated to the database but the problem is it is exactly in the same format as AJDUK MARKO\xc4\x8d and of course I want AJDUK MARKOČ. Database has utf-8 encoding so it is not that.
What am I doing wrong? Surprisingly couldn't really find anything useful on the forums.
\xc4\x8d is the UTF-8 encoding representation of Č. It looks like the insert has worked but you're not printing the result correctly, probably by printing the whole row as a list. I.e.
>>> print "Č"
"Č"
>>> print ["Č"] # a list with one string
['\xc4\x8c']
We need to see more code to validate (It's always a good idea to give as much reproducible code as possible).
You could decode the result (result.decode("utf-8")) but you should avoid manually encoding or decoding. Psycopg2 already allows you send Unicodes, so you can do the following without encoding first:
cur.execute( u"UPDATE res_partner set %s = '%s' where id = %s;" % (columns, value, remote_partner_id))
- note the leading u
Psycopg2 can return Unicodes too by having strings automatically decoded:
import psycopg2
import psycopg2.extensions
psycopg2.extensions.register_type(psycopg2.extensions.UNICODE)
psycopg2.extensions.register_type(psycopg2.extensions.UNICODEARRAY)
Edit:
SQL values should be passed as an argument to .execute(). See the big red box at: http://initd.org/psycopg/docs/usage.html#the-problem-with-the-query-parameters
Instead
E.g.
# Replace the columns field first.
# Strictly we should use http://initd.org/psycopg/docs/sql.html#module-psycopg2.sql
sql = u"UPDATE res_partner set {} = %s where id = %s;".format(columns)
cur.execute(sql, (value, remote_partner_id))