Python 2 does not recognize "newline" for file stream - python-2.7

With Python 3.3, the following code works fine:
import csv
with open(foname, "w", newline='') as outstream:
    csv.writer(outstream, delimiter=' ').writerows(
        [cell.value for cell in row]
        for row in ws.rows
    )
However, Python 2 is unable to run it and says:
    with open(foname, "w", newline='') as outstream:
TypeError: 'newline' is an invalid keyword argument for this function
What is the equivalent for previous versions?

Use with open(foname, 'wb') as outstream: instead. The newline parameter was added to the built-in open() in Python 3.
This is documented for Python 2 as:
If csvfile is a file object, it must be opened with the ‘b’ flag on platforms where that makes a difference.
Whereas for Python 3, the documentation says:
If csvfile is a file object, it should be opened with newline=''
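If the same script has to run under both versions, the two documented rules can be folded into one small helper. This is only a sketch: open_csv_for_writing and write_rows are hypothetical names, not part of the csv module, and rows stands in for the ws.rows data from the question.

```python
import csv
import sys


def open_csv_for_writing(path):
    """Hypothetical helper: open `path` the way each version's csv docs ask."""
    if sys.version_info[0] < 3:
        # Python 2: binary mode, so the csv module controls line endings itself.
        return open(path, "wb")
    # Python 3: text mode with newline='' for the same reason.
    return open(path, "w", newline="")


def write_rows(path, rows):
    """Write an iterable of rows as space-delimited CSV."""
    with open_csv_for_writing(path) as outstream:
        csv.writer(outstream, delimiter=" ").writerows(rows)
```

Either way the csv writer emits its default "\r\n" line terminator without the platform adding an extra "\r".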

Related

Python 2 - Is there a way to declare the encoding when writing CSV?

When I try to write a cursor result from a database query (a list) to a CSV file, this error is thrown:
a.writerow(lst) TypeError: write() argument 1 must be unicode, not str
This happens in Python 2. The script below works in Python 3, but the system requirements force me to use Python 2.
This is the working Python 3 script:
results_percent = cursor.fetchall()
with open(file4,'w',encoding="utf-8",newline='') as fp:
    a = csv.writer(fp, delimiter=',')
    a.writerow(['MFIName','ClientCountAtSignUp','UploadCountLastMonth','UploadCount','80%','Status'])
    a.writerows(results_percent)
This is the Python 2 version, which gives the error a.writerow(lst) TypeError: write() argument 1 must be unicode, not str:
results_percent = cursor.fetchall()
with io.open(file4,'w',encoding='utf-8') as fp:
    a = csv.writer(fp, delimiter=',')
    lst = ['MFIName','ClientCountAtSignUp','UploadCountLastMonth','UploadCount','80%','Status']
    a.writerow(lst)
    a.writerows(results_percent)
The goal is to write results_percent to a CSV file.
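One possible approach, sketched under assumptions: Python 2's csv writer emits byte strings, while a file from io.open(..., encoding=...) accepts only unicode, which would explain the error. Opening the file in binary mode and encoding the cells by hand sidesteps that. write_utf8_csv is a hypothetical helper name, and the code assumes the cells are text or values the csv module can already write.

```python
import csv
import io
import sys

PY2 = sys.version_info[0] == 2


def write_utf8_csv(path, header, rows):
    """Hypothetical helper: write rows as UTF-8 CSV on Python 2 or 3."""
    if PY2:
        # Python 2's csv writer emits byte strings, so the file must be
        # binary and unicode cells are encoded to UTF-8 before writing.
        def encode_row(row):
            return [c.encode("utf-8") if isinstance(c, unicode) else c  # noqa: F821
                    for c in row]
        with open(path, "wb") as fp:
            writer = csv.writer(fp, delimiter=",")
            writer.writerow(encode_row(header))
            writer.writerows(encode_row(r) for r in rows)
    else:
        # Python 3: the text layer does the encoding.
        with io.open(path, "w", encoding="utf-8", newline="") as fp:
            writer = csv.writer(fp, delimiter=",")
            writer.writerow(header)
            writer.writerows(rows)
```

With the names from the question, the call would look like write_utf8_csv(file4, lst, results_percent).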

Django encoding error when reading from a CSV

When I try to run:
import csv
with open('data.csv', 'rU') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        pgd = Player.objects.get_or_create(
            player_name=row['Player'],
            team=row['Team'],
            position=row['Position']
        )
Most of my data gets created in the database, except for one particular row. When my script reaches the row, I receive the error:
ProgrammingError: You must not use 8-bit bytestrings unless you use a
text_factory that can interpret 8-bit bytestrings (like text_factory = str).
It is highly recommended that you instead just switch your application to Unicode strings.
The particular row in the CSV that causes this error is:
>>> row
{'FR\xed\x8aD\xed\x8aRIC.ST-DENIS', 'BOS', 'G'}
I've looked at other Stack Overflow threads with the same or similar issues, but most aren't specific to using SQLite with Django. Any advice?
If it matters, I'm running the script by going into the Django shell by calling python manage.py shell, and copy-pasting it in, as opposed to just calling the script from the command line.
This is the stacktrace I get:
Traceback (most recent call last):
  File "<console>", line 4, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/csv.py", line 108, in next
    row = self.reader.next()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py", line 302, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xcc in position 1674: invalid continuation byte
EDIT: I decided to just manually import this entry into my database, rather than try to read it from my CSV, based on Alastair McCormack's feedback.
Based on the output from your question, it looks like the person who made the CSV mojibaked it - it doesn't seem to represent FRÉDÉRIC.ST-DENIS. You can try using windows-1252 instead of utf-8 but I think you'll end up with FRíŠDíŠRIC.ST-DENIS in your database.
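To check that claim yourself, you can decode the bytes from the row shown in the question both ways. This is just a small experiment; raw is copied from the repr in the question and the variable names are made up for illustration.

```python
# Bytes of the player name exactly as they appear in the question's repr.
raw = b"FR\xed\x8aD\xed\x8aRIC.ST-DENIS"

try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    # \xed\x8a starts a multi-byte sequence that the following "D" cannot finish.
    utf8_ok = False

# windows-1252 accepts these bytes, but maps them to the wrong letters.
as_cp1252 = raw.decode("windows-1252")

print(utf8_ok)
print(as_cp1252 == u"FR\xed\u0160D\xed\u0160RIC.ST-DENIS")
```

The UTF-8 decode fails and the windows-1252 decode yields the mangled name rather than FRÉDÉRIC, which suggests the damage happened before the file reached you.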
I suspect you're using Python 2 - open() returns str objects, which are simply byte strings.
The error is telling you that you need to decode your text to Unicode strings before use.
The simplest method is to decode each cell:
with open('data.csv', 'r') as csvfile: # 'U' means Universal line mode and is not necessary
    reader = csv.DictReader(csvfile)
    for row in reader:
        pgd = Player.objects.get_or_create(
            player_name=row['Player'].decode('utf-8'),
            team=row['Team'].decode('utf-8'),
            position=row['Position'].decode('utf-8')
        )
That'll work, but it's ugly to add decodes everywhere, and it won't work in Python 3. Python 3 improves things by opening files in text mode and returning Python 3 strings, which are the equivalent of Unicode strings in Python 2.
To get the same functionality in Python 2, use the io module. This gives you an open() function which has an encoding option. Annoyingly, the Python 2.x csv module does not handle Unicode input, so you need to install a backported version:
pip install backports.csv
To tidy your code and future proof it, do:
import io
from backports import csv
with io.open('data.csv', 'r', encoding='utf-8') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        # now every row is automatically decoded from UTF-8
        pgd = Player.objects.get_or_create(
            player_name=row['Player'],
            team=row['Team'],
            position=row['Position']
        )
Encode the player name as UTF-8 by calling .encode('utf-8') on it:
import csv
with open('data.csv', 'rU') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        pgd = Player.objects.get_or_create(
            player_name=row['Player'].encode('utf-8'),
            team=row['Team'],
            position=row['Position']
        )
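In Django, decode with latin-1 instead: csv.DictReader(io.StringIO(csv_file.read().decode('latin-1'))). Latin-1 maps every byte to a character, so it accepts all the special characters and avoids the decode exceptions you get with utf-8, although text that is really in another encoding will come out as the wrong characters.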
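A minimal sketch of that latin-1 approach, assuming Python 3 and an uploaded file that has already been read into bytes. csv_bytes is made-up sample data standing in for csv_file.read(); here the name really is latin-1 encoded.

```python
import csv
import io

# Made-up sample upload: the accented letters are single latin-1 bytes.
csv_bytes = b"Player,Team,Position\nFR\xc9D\xc9RIC.ST-DENIS,BOS,G\n"

# Decode the whole payload first, then let DictReader work on text.
reader = csv.DictReader(io.StringIO(csv_bytes.decode("latin-1")))
rows = list(reader)
```

Each entry in rows is then a dict keyed by the header names, with the player name already a text string.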

Python 3: convert str to a bytes-like object without using encode

I wrote an HTTP server to serve HTML files that should run on both Python 2.7 and Python 3.5.
def do_GET(self):
    ...
    # if resource is api
    data = json.dumps({'message':['thanks for your answer']})
    # if resource is file name
    with open(resource, 'rb') as f:
        data = f.read()
    self.send_response(response)
    self.send_header('Access-Control-Allow-Origin', '*')
    self.end_headers()
    self.wfile.write(data) # this line raises TypeError: a bytes-like object is required, not 'str'
The code works in Python 2.7, but in Python 3 it raises the error above.
I could use bytearray(data, 'utf-8') to convert the str to bytes, but then the HTML shown in the web page is changed.
My questions:
How can I support both Python 2 and Python 3 without using the 2to3 tool and without changing the file's encoding?
Is there a better way to read a file and send its content to the client that works the same way in Python 2 and Python 3?
Thanks in advance.
You just have to open your file in binary mode, not in text mode:
with open(resource,"rb") as f:
    data = f.read()
Then data is a bytes object in Python 3 and a str in Python 2, and it works for both versions.
As a positive side effect, this code also works on a Windows box (otherwise binary files such as images are corrupted by the line-ending conversion that happens in text mode).
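The file branch is covered by binary mode, but the json.dumps branch in the question still produces a str on Python 3. One way to handle both, sketched here with a hypothetical to_bytes helper, is to pass bytes through untouched and encode only real text. It does call encode, but never on file contents, so the served HTML bytes stay exactly as they are on disk.

```python
import json


def to_bytes(data, encoding="utf-8"):
    """Hypothetical helper: return `data` as bytes on Python 2 and 3."""
    if isinstance(data, bytes):
        # File contents read in 'rb' mode (and any Python 2 str) pass through.
        return data
    # Python 3 str (or Python 2 unicode), e.g. the result of json.dumps.
    return data.encode(encoding)


api_body = to_bytes(json.dumps({'message': ['thanks for your answer']}))
file_body = to_bytes(b"<html></html>")
```

In do_GET the last line would then become self.wfile.write(to_bytes(data)).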

Error: unknown dialect

I'm using the csv reader in the csv module to read a file in the following format:
Filename, Foo, Label
Each record looks as follows.
file1.wav,"[ 1.92849546e+02 2.86156126e+00 -7.96250116e+00
7.29509485e+02 4.79000000e+02 5.51000000e+02]",1
I get the following error when reading the file.
set_ = csv.reader(open(foo), 'rb', delimiter = ',')
Error: unknown dialect
Also, I am using Python 2.7 on a Windows machine.
You are using the csv.reader API incorrectly.
As per the documentation, the second argument to csv.reader is dialect, so passing "rb" there does not make sense; "rb" is a file mode and belongs in the open() call.
Instead, you probably intend to do something along these lines:
with open(foo, 'rb') as infile:
    reader = csv.reader(infile, delimiter=',')
    # etc.
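Going one step further, here is a rough sketch of reading records shaped like the one in the question, where the quoted field spans two lines. read_records is a hypothetical name, and the code assumes the file has no header row and that every record has exactly three fields.

```python
import csv
import sys


def read_records(path):
    """Hypothetical helper: return (filename, floats, label) tuples."""
    # Python 2's csv wants binary mode; Python 3 wants text mode with newline=''.
    if sys.version_info[0] < 3:
        infile = open(path, "rb")
    else:
        infile = open(path, "r", newline="")
    records = []
    with infile:
        for name, values, label in csv.reader(infile, delimiter=","):
            # The quoted field keeps its embedded newline; split() ignores it.
            floats = [float(v) for v in values.strip("[]").split()]
            records.append((name, floats, int(label)))
    return records
```

The csv module takes care of the line break inside the quotes, so each record arrives as a single row.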

Gzip and Encode file in Python 2.7

with gzip.open(sys.argv[5] + ".json.gz", mode="w", encoding="utf-8") as outfile:
It throws:
TypeError: open() got an unexpected keyword argument 'encoding'
But the docs say it exists:
https://docs.python.org/3/library/gzip.html
Update
How can I encode and gzip the file in Python 2.7?
I have now tried this, but it doesn't work:
with gzip.open(sys.argv[5] + ".json.gz", mode="w") as outfile:
    outfile = io.TextIOWrapper(outfile, encoding="utf-8")
    json.dump(fdata, outfile, indent=2, ensure_ascii=False)
TypeError: must be unicode, not str
What can I do?
Those are the Python 3 docs. The Python 2 version of gzip does not allow encoding= as a keyword argument to gzip.open().
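One way around that, as a sketch rather than the only option: serialize to a string first, encode it yourself, and write the bytes to a gzip file opened in binary mode. dump_json_gz is a hypothetical helper name. On Python 2 this assumes the data holds unicode strings (or plain ASCII), since mixing non-ASCII byte strings with unicode can still raise in json.dumps.

```python
import gzip
import json


def dump_json_gz(obj, path):
    """Hypothetical helper: write `obj` as UTF-8 JSON into a gzip file."""
    text = json.dumps(obj, indent=2, ensure_ascii=False)
    if not isinstance(text, bytes):
        # Python 3 str, or Python 2 unicode: encode by hand.
        text = text.encode("utf-8")
    with gzip.open(path, "wb") as outfile:
        outfile.write(text)
```

With the names from the question, the call would be dump_json_gz(fdata, sys.argv[5] + ".json.gz").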
It seems the question has been answered sufficiently, but for your peace of mind: alternatively, to make Python 2 use UTF-8 as its default encoding, you could try the following, so that it becomes unnecessary to specify an encoding:
import sys
reload(sys)
sys.setdefaultencoding('UTF8')