Django character encoding - django

I'm working on a new Django site, and, after migrating in a pile of data, have started running into a deeply frustrating DjangoUnicodeDecodeError. The bad character in question is a \xe8 (e-grave).
There's two sticky issues:
It only happens in the production server, running an Apache-fronted fcgi process (running the same code with the same database on the Django dev server has no issues)
The stack trace in question is entirely within Django code. It occurs in the admin site (elsewhere too) when retrieving an item to display, though the field that contains the bad character is not actually ever rendered.
I'm not even entirely sure where to begin debugging this, short of trying to remove the offending characters manually. My guess is that it's a configuration issue, since it's environment-specific, but I'm not sure where to start there either.
EDIT:
As Daniel Roseman pointed out, the error is almost certainly in the unicode method--or, more precisely, another method that it calls. Note that the offending characters are in a field not referenced at all in the code here. I suppose that the exception is raised in a method that builds the object from the db result--if the queryset is never evaluated (e.g. if not self.enabled) there's no error. Here's the code:
def get_blocking_events(self):
return Event.objects.filter(<get a set of events>)
def get_blocking_reason(self):
blockers = self.get_blocking_events()
label = u''
if not self.enabled:
label = u'Sponsor disabled'
elif len(blockers) > 0:
label = u'Pending follow-up: "{0}" ({1})'.format(blockers[0],blockers[0].creator.email)
if len(blockers) > 1:
label += u" and {0} other event".format(len(blockers)-1)
if len(blockers) > 2:
label += u"s"
return label
def __unicode__(self):
label = self.name
blocking_msg = self.get_blocking_reason()
if len(blocking_msg):
label += u" ({0})".format(blocking_msg)
return label
Here's the tail of the stack trace, for fun:
File "/opt/opt.LOCAL/Django-1.2.1/django/template/__init__.py", line 954, in render
dict = func(*args)
File "/opt/opt.LOCAL/Django-1.2.1/django/contrib/admin/templatetags/admin_list.py", line 209, in result_list
'results': list(results(cl))}
File "/opt/opt.LOCAL/Django-1.2.1/django/contrib/admin/templatetags/admin_list.py", line 201, in results
yield list(items_for_result(cl, res, None))
File "/opt/opt.LOCAL/Django-1.2.1/django/contrib/admin/templatetags/admin_list.py", line 138, in items_for_result
f, attr, value = lookup_field(field_name, result, cl.model_admin)
File "/opt/opt.LOCAL/Django-1.2.1/django/contrib/admin/util.py", line 270, in lookup_field
value = attr()
File "/opt/opt.LOCAL/Django-1.2.1/django/db/models/base.py", line 352, in __str__
return force_unicode(self).encode('utf-8')
File "/opt/opt.LOCAL/Django-1.2.1/django/utils/encoding.py", line 88, in force_unicode
raise DjangoUnicodeDecodeError(s, *e.args)
DjangoUnicodeDecodeError: 'utf8' codec can't decode bytes in position 956-958: invalid data. You passed in <Sponsor: [Bad Unicode data]> (<class 'SJP.alcohol.models.Sponsor'>)

The issue here is that in unicode you use the following line:
label += " ({0})".format(blocking_msg)
And unfortunately in python 2.x this is trying to format blocking_msg as an ascii string. What you meant to type was:
label += u" ({0})".format(blocking_msg)

Turns out this is likely due to the FreeTDS layer that connects to the SQL Server. While FreeTDS provides some support for automatically converting encodings, my setup is either misconfigured or otherwise not working quite right.
Rather than fighting this battle, I've migrated to MySQL for now.

Related

Batch SQL INSERT in Django App

I am following this example to batch insert records into a table but modifying it to fit my specific example as such
sql='INSERT INTO CypressApp_grammatrix (name, row_num, col_num, gram_amount) VALUES {}'.format(', '.join(['(%s, %s, %s, %s)']*len(gram_matrix)),)
#print sql
params=[]
for gram in gram_matrix:
col_num=1
for g in gram:
params.extend([(matrix_name, row_num, col_num, g)])
col_num += 1
row_num += 1
print params
with closing(connection.cursor()) as cursor:
cursor.execute(sql, params)
However, upon doing so, I receive this error
return cursor._last_executed.decode('utf-8')
File "/usr/local/lib/python2.7/dist-packages/django/db/backends/mysql/base.py", line 150, in __getattr__
return getattr(self.cursor, attr)
AttributeError: 'Cursor' object has no attribute '_last_executed'
I would like to know why I received this error and what I can do to fix it, although I feel the problem could be with this code that works with MySQL that I did not write
def last_executed_query(self, cursor, sql, params):
# With MySQLdb, cursor objects have an (undocumented) "_last_executed"
# attribute where the exact query sent to the database is saved.
# See MySQLdb/cursors.py in the source distribution.
return cursor._last_executed.decode('utf-8')
So I don't know if I simply have an old copy of MySQLdb or what, but the problem appear to be with cursors.py. The only spot in that file where you can find _last_executed is here
def _do_query(self, q):
db = self._get_db()
self._last_executed = q
db.query(q)
self._do_get_result()
return self.rowcount
However, the __init__ does not set up this variable as an instance attribute. It's missing completely. So I took the liberty of adding it myself and initializing it to some query string. I assumed any would do, so I just added
class BaseCursor(object):
"""A base for Cursor classes. Useful attributes:
description
A tuple of DB API 7-tuples describing the columns in
the last executed query; see PEP-249 for details.
description_flags
Tuple of column flags for last query, one entry per column
in the result set. Values correspond to those in
MySQLdb.constants.FLAG. See MySQL documentation (C API)
for more information. Non-standard extension.
arraysize
default number of rows fetchmany() will fetch
"""
from _mysql_exceptions import MySQLError, Warning, Error, InterfaceError, \
DatabaseError, DataError, OperationalError, IntegrityError, \
InternalError, ProgrammingError, NotSupportedError
def __init__(self, connection):
from weakref import ref
...
self._last_executed ="SELECT * FROM T"
...
Now the cursor object does have the attribute _last_executed and when this function
def last_executed_query(self, cursor, sql, params):
# With MySQLdb, cursor objects have an (undocumented) "_last_executed"
# attribute where the exact query sent to the database is saved.
# See MySQLdb/cursors.py in the source distribution.
return cursor._last_executed.decode('utf-8')
in base.py is called, the attribute does exist and so this error
return cursor._last_executed.decode('utf-8')
File "/usr/local/lib/python2.7/dist-
packages/django/db/backends/mysql/base.py", line 150, in __getattr__
return getattr(self.cursor, attr)
AttributeError: 'Cursor' object has no attribute '_last_executed'
will not be encountered. At least that is how I believe it works. In any case, it fixed the situation for me.

Error translating Tornado template with gettext

I've got this site running on top with Tornado and its template engine that I want to Internationalize, so I thought on using gettext to help me with that.
Since my site is already in Portuguese, my message.po (template) file has all msgid's in portuguese as well (example):
#: base.html:30 base.html:51
msgid "Início"
msgstr ""
It was generated with xgettext:
xgettext -i *.html -L Python --from-code UTF-8
Later I used Poedit to generate the translation file en_US.po and later compile it as en_US.mo.
Stored in my translation folder:
translation/en_US/LC_MESSAGES/site.mo
So far, so good.
I've created a really simple RequestHandler that would render and return the translated site.
import os
import logging
from tornado.web import RequestHandler
import tornado.locale as locale
LOG = logging.getLogger(__name__)
class SiteHandler(RequestHandler):
def initialize(self):
locale.load_gettext_translations(os.path.join(os.path.dirname(__file__), '../translations'), "site")
def get(self, page):
LOG.debug("PAGE REQUESTED: %s", page)
self.render("site/%s.html" %page)
As far as I know that should work perfectly, but somehow I've encountered some issues:
1 - How do I tell Tornado that my template has its text in Portuguese so it won't go looking for a pt locale which I don't have?
2 - When asking for the site with en_US locale, it loads ok but when Tornado is going to translate, it throws an encoding exception.
TypeError: not all arguments converted during string formatting
ERROR:views.site:Could not load template
Traceback (most recent call last):
File "/Users/ademarizu/Dev/git/new_plugin/site/src/main/py/views/site.py", line 20, in get
self.render("site/%s.html" %page)
File "/Users/ademarizu/Dev/virtualEnvs/execute/lib/python2.7/site-packages/tornado/web.py", line 664, in render
html = self.render_string(template_name, **kwargs)
File "/Users/ademarizu/Dev/virtualEnvs/execute/lib/python2.7/site-packages/tornado/web.py", line 771, in render_string
return t.generate(**namespace)
File "/Users/ademarizu/Dev/virtualEnvs/execute/lib/python2.7/site-packages/tornado/template.py", line 278, in generate
return execute()
File "site/home_html.generated.py", line 11, in _tt_execute
_tt_tmp = _("Início") # site/base.html:30
File "/Users/ademarizu/Dev/virtualEnvs/execute/lib/python2.7/site-packages/tornado/locale.py", line 446, in translate
return self.gettext(message)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/gettext.py", line 406, in ugettext
return self._fallback.ugettext(message)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/gettext.py", line 407, in ugettext
return unicode(message)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 2: ordinal not in range(128)
Any help?
Ah, I'm running python 2.7 btw!
1 - How do I tell Tornado that my template has its text in Portuguese so it won't go looking for a pt locale which I don't have?
This is what the set_default_locale method is for. Call tornado.locale.set_default_locale('pt') (or pt_BR, etc) once at startup to tell tornado that your template source is in Portuguese.
2 - When asking for the site with en_US locale, it loads ok but when Tornado is going to translate, it throws an encoding exception.
Remember that in Python 2, strings containing non-ascii characters need to be marked as unicode. Instead of _("Início"), use _(u"Início").

How to solve AttributeError in python active_directory?

Running the below script works for 60% of the entries from the MasterGroupList however suddenly fails with the below error. although my questions seem to be poor ou guys have been able to help me before. Any idea how I can avoid getting this error? or what is trhoughing off the script? The masterGroupList looks like:
Groups Pulled from AD
SET00 POWERUSER
SET00 USERS
SEF00 CREATORS
SEF00 USERS
...another 300 entries...
Error:
Traceback (most recent call last):
File "C:\Users\ks185278\OneDrive - NCR Corporation\Active Directory Access Scr
ipt\test.py", line 44, in <module>
print group.member
File "C:\Python27\lib\site-packages\active_directory.py", line 805, in __getat
tr__
raise AttributeError
AttributeError
Code:
from active_directory import *
import os
file = open("C:\Users\NAME\Active Directory Access Script\MasterGroupList.txt", "r")
fileAsList = file.readlines()
indexOfTitle = fileAsList.index("Groups Pulled from AD\n")
i = indexOfTitle + 1
while i <= len(fileAsList):
fileLocation = 'C:\\AD Access\\%s\\%s.txt' % (fileAsList[i][:5], fileAsList[i][:fileAsList[i].find("\n")])
#Creates the dir if it does not exist already
if not os.path.isdir(os.path.dirname(fileLocation)):
os.makedirs(os.path.dirname(fileLocation))
fileGroup = open(fileLocation, "w+")
#writes group members to the open file
group = find_group(fileAsList[i][:fileAsList[i].find("\n")])
print group.member
for group_member in group.member: #this is line 44
fileGroup.write(group_member.cn + "\n")
fileGroup.close()
i+=1
Disclaimer: I don't know python, but I know Active Directory fairly well.
If it's failing on this:
for group_member in group.member:
It could possibly mean that the group has no members.
Depending on how phython handles this, it could also mean that the group has only one member and group.member is a plain string rather than an array.
What does print group.member show?
The source code of active_directory.py is here: https://github.com/tjguk/active_directory/blob/master/active_directory.py
These are the relevant lines:
if name not in self._delegate_map:
try:
attr = getattr(self.com_object, name)
except AttributeError:
try:
attr = self.com_object.Get(name)
except:
raise AttributeError
So it looks like it just can't find the attribute you're looking up, which in this case looks like the 'member' attribute.

Created temporary file is not accessable in production

I have written a custom filefield AudioFileField. For this i created a check if a file really is a valid audiofile. To be able to do that, i use the sox commandlinetool, so i have to create a file on disk first. As sox depends on the suffix to do that validation, i needed to write my own TemporaryUploadedAudioFile, using the original suffix (instead of .upload):
class TemporaryUploadedAudioFile(TemporaryUploadedFile):
"""
A file uploaded to a temporary location (i.e. stream-to-disk).
"""
def __init__(self, name, content_type, size, charset, suffix='.upload'):
"""
The init method overrides the name creation to allow passing
an extension, so that sox is able to test the file
"""
if settings.FILE_UPLOAD_TEMP_DIR:
file = tempfile.NamedTemporaryFile(suffix=suffix,
dir=settings.FILE_UPLOAD_TEMP_DIR)
else:
file = tempfile.NamedTemporaryFile(suffix=suffix)
super(TemporaryUploadedFile, self).__init__(file, name, content_type, size, charset)
That file i use to do the audiovalidation in the AudioFileForm to_python method:
def to_python(self, data):
"""
checks that the file-upload field data contains a valid audio file.
"""
f = super(AudioFileForm, self).to_python(data)
if f is None:
return None
# get the file suffix, sox needs this to be able to test the file
suffix = os.path.splitext(data.name)[1]
# We need to get a temporary file for sox. Even if we allready have a temporary
# file, we have to create a new one ending with the correct suffix
file = TemporaryUploadedAudioFile(data.name, data.content_type, 0, data.charset,suffix = suffix)
with open(file.temporary_file_path(), 'w') as f:
f.write(data.read())
# Do the validation of the audiofile.
filetype=subprocess.Popen([sox,'--i','-t','%s'%file.temporary_file_path()], shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
filetype=filetype.communicate()[0]
filetype=filetype.replace('\n','')
if not filetype in ['wav','aiff','flac']:
raise forms.ValidationError('Not a valid audiofile (valid are: aif, flac & wav | 16 or 24 bit | 44.1 or 48 kHz)')
return data
Now to the strange things happening: this works like a charm on the development server, but as soon as i switch to apache2/mod_wsgi it stops working. sox returns an error telling me that the file is missing.
I have allready checked rights, tmp-location on the production server is /tmp, all rights are granted there (777). What else could be happening here?
mod-wsgi is known to have problems with standard output and this subprocess thingy with django. There are already lot of questions answered about this in stackoverflow.com.
A quick Google search should help you!

django cnotes not working

I have just installed django-cnotes
But it wont work.
It just throws up this error
Traceback (most recent call last):
File "/Library/Python/2.5/site-packages/django/core/servers/basehttp.py", line 279, in run
self.result = application(self.environ, self.start_response)
File "/Library/Python/2.5/site-packages/django/core/servers/basehttp.py", line 651, in __call__
return self.application(environ, start_response)
File "/Library/Python/2.5/site-packages/django/core/handlers/wsgi.py", line 245, in __call__
response = middleware_method(request, response)
File "/Library/Python/2.5/site-packages/django_cnote-0.3.4-py2.5.egg/cnotes/middleware.py", line 47, in process_response
signed_data = self.sign('cnotes', base64.urlsafe_b64encode(Pickle.dumps(cnotes.cnotes)))
PicklingError: Can't pickle <class 'django.utils.functional.__proxy__'>: attribute lookup django.utils.functional.__proxy__ failed
And it is not even in the normal django error debug page. What you see above is all there is on the screen.
And I have just used it as described on github, I just dont get it. Any one have an idea for what is causing this?
UPDATE:
Okay, so I have found something, I think.
message = _("You have successfully altered ")
message += edituser.username
cnotes.add(message)
message2 = _("You may now close ")
cnotes.add(message2)
This will cause the error. So I thought "Okay, I can only call it once per view" That would have been stupid and it was indeed not the cause.
The following code will produce no error
message = _("You have successfully altered ")
message += edituser.username
cnotes.add(message)
message2 = '_("You may now close ")'
cnotes.add(message2)
But is not because of the translation it uses that fine just 2 lines above, but it has to be something with doing another translation or something. Im lost.
It appears as though pickle is receiving an object of type django.utils.functional.__proxy__. This means either your input is weird, or there is a bug in cnotes.
If there is something wrong with your input to cnotes, you should see it if you take a look at the types of your messages (I used the manage.py shell):
>>> message = _("You have successfully altered ")
>>> message += "Bob Knoblick"
>>> type(message)
<type 'unicode'>
>>> message2 = _("You may now close ")
>>> type(message2)
<type 'unicode'>
>>>
If your types come back as anything other than unicode or str, I'd dig into your code and figure out where that other type is coming from, or ensure that it can be pickled.
If there is something wrong within cnotes, you should get the same error doing this:
cnotes.add(u'Foo')
cnotes.add(u'Bar')
cnotes.add(u'Baz')
Per the original author:
The translated string, _("You may now close ") was not ending up as a unicode string. One can use this to force unicode before sending to cnotes:
message2 = unicode(_("You may now close "))