Why is Django's FileField giving me an encoding error?

I have a FileField that saves the file to my file system (Mac) for development, and retrieving this file works with its url, but I get an encoding exception when I try to read it; here's my code:
View:
def download_database(request, id):
    try:
        project = Project.objects.get(id=id)
        with project.database.open('r') as db:
            response = HttpResponse(
                db.read(), content_type="application/vnd.sqlite3, application/x-sqlite3")
            response['Content-Disposition'] = f'inline; filename={project.database.name}'
            return response
    except Project.DoesNotExist:
        raise Http404
HTML template:
Download
Download
Again, the browser downloads the file correctly with the 1st version, but fails with the 2nd, with:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfb in position 106: invalid start byte
... contained in the exception trace:
Internal Server Error: /project/download_database/54553950-15e5-4ea1-999e-8a6ec2a84ffb
Traceback (most recent call last):
  File "/Users/csabbey/code/survey_server/venv/lib/python3.11/site-packages/django/core/handlers/exception.py", line 55, in inner
    response = get_response(request)
               ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/csabbey/code/survey_server/venv/lib/python3.11/site-packages/django/core/handlers/base.py", line 197, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/csabbey/code/survey_server/survey_server/views.py", line 70, in download_database
    db.read(), content_type="application/vnd.sqlite3, application/x-sqlite3")
    ^^^^^^^^^
  File "<frozen codecs>", line 322, in decode
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfb in position 106: invalid start byte
[15/Jan/2023 20:33:01] "GET /project/download_database/54553950-15e5-4ea1-999e-8a6ec2a84ffb HTTP/1.1" 500 90923
This reading works when reading from Google Cloud Storage, which we use for production systems, but not when reading from the development machines' file system. Why is this happening, and how can we fix it?

Open the file in binary mode:
    with project.database.open('rb') as db:
        ...
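To see why the mode matters: a SQLite database is binary data, and text mode asks Python to decode every byte as UTF-8 while reading. The sketch below reproduces that with a plain temporary file instead of a FileField; the payload bytes and the helper names are made up for illustration. Why the production storage backend tolerates 'r' is not confirmed in the question; one plausible guess is that it returns bytes regardless of the requested mode.

```python
import os
import tempfile

# Bytes that are not valid UTF-8, standing in for SQLite file contents.
# 0xfb is the byte reported in the traceback above.
payload = b"SQLite format 3\x00\xfb\x00\x01"

fd, path = tempfile.mkstemp(suffix=".sqlite3")
with os.fdopen(fd, "wb") as f:
    f.write(payload)


def read_text_mode(p):
    """Text mode: Python decodes the bytes, which fails on binary data."""
    try:
        with open(p, "r", encoding="utf-8") as f:
            return f.read()
    except UnicodeDecodeError as exc:
        return exc


def read_binary_mode(p):
    """Binary mode: the bytes come back untouched."""
    with open(p, "rb") as f:
        return f.read()


text_result = read_text_mode(path)      # a UnicodeDecodeError instance
binary_result = read_binary_mode(path)  # identical to payload
os.remove(path)
```

The same applies to project.database.open(): with 'rb' the response body is built from raw bytes and no decoding step is involved.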

Related

Python request to API keeps returning ZeroReturnError exception

Python 2.7.3
I'm calling an API from a Raspberry Pi 3. The API logs show the request hits the correct endpoint and returns a 200 status code, but the Python code on the Pi prints a huge stack trace. Some forum posts say ZeroReturnError is always raised and does not mean anything went wrong, but that seems odd, since I can't get at the response from an except block around the call.
My code is literally
import requests
response = requests.get(<URL I AM USING>, json={JSON I AM USING})
Not sure what to do.
Traceback (most recent call last):
File "music.py", line 13, in <module>
response = requests.get(url, json={'blah':{'blah':'*********'}})
File "/usr/lib/python2.7/dist-packages/requests/api.py", line 60, in get
return request('get', url, **kwargs)
File "/usr/lib/python2.7/dist-packages/requests/api.py", line 49, in request
return session.request(method=method, url=url, **kwargs)
File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 457, in request
resp = self.send(prep, **send_kwargs)
File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 606, in send
r.content
File "/usr/lib/python2.7/dist-packages/requests/models.py", line 724, in content
self._content = bytes().join(self.iter_content(CONTENT_CHUNK_SIZE)) or bytes()
File "/usr/lib/python2.7/dist-packages/requests/models.py", line 653, in generate
for chunk in self.raw.stream(chunk_size, decode_content=True):
File "/usr/lib/python2.7/dist-packages/urllib3/response.py", line 256, in stream
data = self.read(amt=amt, decode_content=decode_content)
File "/usr/lib/python2.7/dist-packages/urllib3/response.py", line 186, in read
data = self._fp.read(amt)
File "/usr/lib/python2.7/httplib.py", line 602, in read
s = self.fp.read(amt)
File "/usr/lib/python2.7/socket.py", line 380, in read
data = self._sock.recv(left)
File "/usr/lib/python2.7/dist-packages/urllib3/contrib/pyopenssl.py", line 188, in recv
data = self.connection.recv(*args, **kwargs)
OpenSSL.SSL.ZeroReturnError

Some more searching led me to think it was a version issue.
Running sudo pip install urllib3 --upgrade on the Raspberry Pi cleared it up.
I am getting a DependencyWarning about installing PySocks, but it's working correctly now.

Getting error 'ascii' codec can't decode byte 0xc3 in position 149: ordinal not in range(128)' when rebuilding haystack index

I have an application where I have to store people's names and make them searchable. The technologies I am using are Python (v2.7.6), Django (v1.9.5) and REST framework. The DBMS is PostgreSQL (v9.2). Since user names can be Arabic, we use UTF-8 as the database encoding. For search we use Haystack (v2.4.1) with Amazon Elasticsearch for indexing. The index was building fine a few days ago, but now when I try to rebuild it with
python manage.py rebuild_index
it fails with the following error
'ascii' codec can't decode byte 0xc3 in position 149: ordinal not in range(128)
The full error trace is
File "/usr/local/lib/python2.7/dist-packages/haystack/management/commands/update_index.py", line 188, in handle_label
self.update_backend(label, using)
File "/usr/local/lib/python2.7/dist-packages/haystack/management/commands/update_index.py", line 233, in update_backend
do_update(backend, index, qs, start, end, total, verbosity=self.verbosity, commit=self.commit)
File "/usr/local/lib/python2.7/dist-packages/haystack/management/commands/update_index.py", line 96, in do_update
backend.update(index, current_qs, commit=commit)
File "/usr/local/lib/python2.7/dist-packages/haystack/backends/elasticsearch_backend.py", line 193, in update
bulk(self.conn, prepped_docs, index=self.index_name, doc_type='modelresult')
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/helpers/__init__.py", line 188, in bulk
for ok, item in streaming_bulk(client, actions, **kwargs):
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/helpers/__init__.py", line 160, in streaming_bulk
for result in _process_bulk_chunk(client, bulk_actions, raise_on_exception, raise_on_error, **kwargs):
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/helpers/__init__.py", line 85, in _process_bulk_chunk
resp = client.bulk('\n'.join(bulk_actions) + '\n', **kwargs)
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/client/utils.py", line 69, in _wrapped
return func(*args, params=params, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/client/__init__.py", line 795, in bulk
doc_type, '_bulk'), params=params, body=self._bulk_body(body))
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/transport.py", line 329, in perform_request
status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
File "/usr/local/lib/python2.7/dist-packages/elasticsearch/connection/http_requests.py", line 68, in perform_request
response = self.session.request(method, url, data=body, timeout=timeout or self.timeout)
File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 455, in request
resp = self.send(prep, **send_kwargs)
File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 558, in send
r = adapter.send(request, **kwargs)
File "/usr/lib/python2.7/dist-packages/requests/adapters.py", line 330, in send
timeout=timeout
File "/usr/local/lib/python2.7/dist-packages/urllib3/connectionpool.py", line 558, in urlopen
body=body, headers=headers)
File "/usr/local/lib/python2.7/dist-packages/urllib3/connectionpool.py", line 353, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/usr/lib/python2.7/httplib.py", line 979, in request
self._send_request(method, url, body, headers)
File "/usr/lib/python2.7/httplib.py", line 1013, in _send_request
self.endheaders(body)
File "/usr/lib/python2.7/httplib.py", line 975, in endheaders
self._send_output(message_body)
File "/usr/lib/python2.7/httplib.py", line 833, in _send_output
msg += message_body
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 149: ordinal not in range(128)
My guess is that we previously had no Arabic characters in the database, so the index built fine, but now that users have entered Arabic characters the index fails to build.

If you are using the requests-aws4auth package, then you can use the following wrapper class in place of the AWS4Auth class. It encodes the headers created by AWS4Auth into byte strings thus avoiding the UnicodeDecodeError downstream.
from requests_aws4auth import AWS4Auth

class AWS4AuthEncodingFix(AWS4Auth):
    def __call__(self, request):
        request = super(AWS4AuthEncodingFix, self).__call__(request)
        for header_name in request.headers:
            self._encode_header_to_utf8(request, header_name)
        return request

    def _encode_header_to_utf8(self, request, header_name):
        value = request.headers[header_name]
        if isinstance(value, unicode):
            value = value.encode('utf-8')
        if isinstance(header_name, unicode):
            del request.headers[header_name]
            header_name = header_name.encode('utf-8')
        request.headers[header_name] = value

I suspect you're correct that Arabic characters now showing up in the DB are what triggers this.
https://github.com/elastic/elasticsearch-py/issues/392
https://github.com/django-haystack/django-haystack/issues/1072
are also possibly related to this issue. The first link seems to have some kind of workaround for it, but doesn't go into much detail. I suspect what the author meant with
The proper fix is to use unicode type instead of str or set the default encoding properly to (I assume) utf-8.
is that you need to check that the machine it's running on has LANG=en_US.UTF-8, or at least some UTF-8 LANG.
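A quick way to check what the machine actually reports is to print the locale-related settings from the same interpreter that runs manage.py. This is only a diagnostic sketch; the function name encoding_report is made up here, and which of these values matters for your failure is an assumption you would still need to verify.

```python
import locale
import os
import sys


def encoding_report():
    """Collect the settings that usually influence implicit encoding choices."""
    return {
        "LANG": os.environ.get("LANG"),      # None if the variable is unset
        "LC_ALL": os.environ.get("LC_ALL"),  # overrides LANG when set
        "default_encoding": sys.getdefaultencoding(),
        "preferred_encoding": locale.getpreferredencoding(False),
        "filesystem_encoding": sys.getfilesystemencoding(),
    }


report = encoding_report()
for key, value in sorted(report.items()):
    print("%s = %s" % (key, value))
```

If LANG is unset or set to C/POSIX, the preferred encoding typically falls back to ASCII, which matches the codec named in the error.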

Elasticsearch supports different encodings, so having Arabic characters shouldn't be the problem.
Since you are using AWS, I will assume you also use an authorization library such as requests-aws4auth.
If that is the case, notice that during authorization some unicode headers are added, such as u'x-amz-date'. That is a problem, because Python's httplib performs the following in _send_output(): msg = "\r\n".join(self._buffer), where _buffer is a list of the HTTP headers. Unicode headers make msg a <type 'unicode'> when it really should be of type str (here is a similar issue with a different auth library).
The line that raises the exception, msg += message_body, does so because Python has to decode message_body to unicode to match the type of msg. Since py-elasticsearch has already encoded the body to UTF-8 bytes, that implicit ASCII decode fails on the first non-ASCII byte (as explained here).
You may want to try replacing the auth library (for example with DavidMuller/aws-requests-auth) and see if that fixes the problem.
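The implicit decode described above is Python 2 behaviour, so it cannot be triggered literally on Python 3, where mixing text and bytes raises a TypeError instead. The sketch below imitates it by spelling out the ASCII decode that Python 2 performs silently for unicode + str. The header text, date value, body and the function name py2_style_concat are all made up for illustration.

```python
# Unicode headers, as produced by an auth library that adds u'x-amz-date'.
headers = u"POST /_bulk HTTP/1.1\r\nx-amz-date: 20160101T000000Z\r\n\r\n"

# A body already encoded to UTF-8 bytes; U+00E9 becomes b"\xc3\xa9",
# so the first non-ASCII byte is 0xc3, as in the error message.
body = u'{"name": "\u00e9"}'.encode("utf-8")


def py2_style_concat(msg, message_body):
    """Mimic Python 2's `unicode + str`: the byte string is decoded as ASCII."""
    return msg + message_body.decode("ascii")


try:
    py2_style_concat(headers, body)
    error = None
except UnicodeDecodeError as exc:
    error = exc  # 'ascii' codec can't decode byte 0xc3 ...

# With byte-string headers there is nothing to decode, which is what
# the header-encoding wrapper or a different auth library achieves.
wire = headers.encode("ascii") + body
```

Both suggested fixes come down to the same thing: make sure the headers are byte strings before httplib joins them with the body.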

Django 1.9 + Python 2.7: style.css did not take effect due to UnicodeDecodeError

I am following the Django tutorial (Django 1.9, Python 2.7) to implement my polls project, but style.css does not take effect (neither the color nor the background image).
https://docs.djangoproject.com/en/1.9/intro/tutorial06/
When running the server, I observed the UnicodeDecodeError due to 'ascii' codec.
UnicodeDecodeError: 'ascii' codec can't decode byte 0xb0 in position 1: ordinal not in range(128)
I followed the suggestion to set the system default encoding to utf8, but the running server then threw a similar error, this time from the 'utf8' codec.
How to fix: "UnicodeDecodeError: 'ascii' codec can't decode byte"
UnicodeDecodeError: 'utf8' codec can't decode byte 0xb0 in position 1: invalid start byte
So I think neither ascii nor utf8 is enough. How can I fix it?
I saw some posts suggesting something like json.loads, but I have no idea how to apply that in my Django project.
json.loads(unicode(opener.open(...), "ISO-8859-1")),
utf8' codec can't decode byte 0x89 in position 15: invalid start byte
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 3-6: invalid data
Issue Full Trace:
[17/Mar/2016 12:26:28] "GET /polls/ HTTP/1.1" 200 161
Traceback (most recent call last):
File "C:\Python27\lib\wsgiref\handlers.py", line 85, in run
self.result = application(self.environ, self.start_response)
File "C:\Python27\lib\site-packages\django\contrib\staticfiles\handlers.py", line 64, in __call__
return super(StaticFilesHandler, self).__call__(environ, start_response)
File "C:\Python27\lib\site-packages\django\core\handlers\wsgi.py", line 177, in __call__
response = self.get_response(request)
File "C:\Python27\lib\site-packages\django\contrib\staticfiles\handlers.py", line 54, in get_response
return self.serve(request)
File "C:\Python27\lib\site-packages\django\contrib\staticfiles\handlers.py", line 47, in serve
return serve(request, self.file_path(request.path), insecure=True)
File "C:\Python27\lib\site-packages\django\contrib\staticfiles\views.py", line 40, in serve
return static.serve(request, path, document_root=document_root, **kwargs)
File "C:\Python27\lib\site-packages\django\views\static.py", line 66, in serve
content_type, encoding = mimetypes.guess_type(fullpath)
File "C:\Python27\lib\mimetypes.py", line 297, in guess_type
init()
File "C:\Python27\lib\mimetypes.py", line 358, in init
db.read_windows_registry()
File "C:\Python27\lib\mimetypes.py", line 258, in read_windows_registry
for subkeyname in enum_types(hkcr):
File "C:\Python27\lib\mimetypes.py", line 249, in enum_types
ctype = ctype.encode(default_encoding) # omit in 3.x!
UnicodeDecodeError: 'utf8' codec can't decode byte 0xb0 in position 1: invalid start byte

django IOError: request data read error from BlackBerry

My site is an intranet and gets hundreds of hits a day. The issue is that Django sometimes crashes, and I receive this traceback:
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/django/core/handlers/base.py", line 105, in get_response
response = middleware_method(request, callback, callback_args, callback_kwargs)
File "/usr/local/lib/python2.7/dist-packages/django/middleware/csrf.py", line 200, in process_view
request_csrf_token = request.POST.get('csrfmiddlewaretoken', '')
File "/usr/local/lib/python2.7/dist-packages/django/core/handlers/wsgi.py", line 210, in _get_post
self._load_post_and_files()
File "/usr/local/lib/python2.7/dist-packages/django/http/__init__.py", line 284, in _load_post_and_files
self._post, self._files = QueryDict(self.raw_post_data, encoding=self._encoding), MultiValueDict()
File "/usr/local/lib/python2.7/dist-packages/django/http/__init__.py", line 248, in _get_raw_post_data
self._raw_post_data = self.read(content_length)
File "/usr/local/lib/python2.7/dist-packages/django/http/__init__.py", line 296, in read
return self._stream.read(*args, **kwargs)
IOError: request data read error
The relevant detail is that I find this in the debug data every time the program crashes:
'HTTP_USER_AGENT': 'Mozilla/5.0 (BlackBerry; U; BlackBerry 9300; es) AppleWebKit/534.8+ (KHTML, like Gecko) Version/6.0.0.668 Mobile Safari/534.8+',
'HTTP_X_RIM_HTTPS': '1.1',
'HTTP_X_WAP_PROFILE': '"http://www.blackberry.net/go/mobile/profiles/uaprof/9300_edge/6.0.0.rdf"',
The app crashes in the login form. Any ideas?

As you might suspect, this is not a Django error.
See https://groups.google.com/group/django-users/browse_thread/thread/946936f69c012d96
I have the error myself (but only with IE Ajax requests; no file upload, just POST data).
I will add a complete answer if I ever find out how to fix this.
REF: IOError: request data read error

Problem exporting an xls file for user download with django and HttpResponse

I'm currently creating a spreadsheet using xlwt and trying to export it out as an HttpResponse in django for a user to download. My code looks like this:
response = HttpResponse(mimetype = "application/vnd.ms-excel")
response['Content-Disposition'] = 'attachment; filename = %s +".xls"' % u'Zinnia_Entries'
work_book.save(response)
return response
Which seems to be the right way to do it, but I'm getting a:
Traceback (most recent call last):
File "C:\dev\workspace-warranty\imcom\imcom\wsgiserver.py", line 1233, in communicate
req.respond()
File "C:\dev\workspace-warranty\imcom\imcom\wsgiserver.py", line 745, in respond
self.server.gateway(self).respond()
File "C:\dev\workspace-warranty\imcom\imcom\wsgiserver.py", line 1927, in respond
response = self.req.server.wsgi_app(self.env, self.start_response)
File "C:\dev\workspace-warranty\3rdparty\django\core\servers\basehttp.py", line 674, in __call__
return self.application(environ, start_response)
File "C:\dev\workspace-warranty\3rdparty\django\core\handlers\wsgi.py", line 252, in __call__
response = middleware_method(request, response)
File "C:\dev\workspace-warranty\imcom\imcom\seo_mod\middleware.py", line 33, in process_response
response.content = strip_spaces_between_tags(response.content.strip())
File "C:\dev\workspace-warranty\3rdparty\django\utils\functional.py", line 259, in wrapper
return func(*args, **kwargs)
File "C:\dev\workspace-warranty\3rdparty\django\utils\html.py", line 89, in strip_spaces_between_tags
return re.sub(r'>\s+<', '><', force_unicode(value))
File "C:\dev\workspace-warranty\3rdparty\django\utils\encoding.py", line 88, in force_unicode
raise DjangoUnicodeDecodeError(s, *e.args)
DjangoUnicodeDecodeError: 'utf8' codec can't decode byte 0xd0 in position 0: invalid continuation byte. You passed in
(I left off the rest because I get a really long line of this \xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1\x00 kind of stuff)
Do you have any idea what could be wrong here? Is it because some of my write calls look like this:
work_sheet.write(r,#,information) where information isn't cast to unicode?

response['Content-Disposition'] = 'attachment; filename = %s +".xls"' % u'Zinnia_Entries'
should just be
response['Content-Disposition'] = 'attachment; filename = %s.xls' % u'Zinnia_Entries'
without quotes around .xls; otherwise the output will be
u'attachment; filename = Zinnia_Entries +".xls"'
So try changing that.
But also check out this answer. It has a really helpful little function for outputting xls files.
django excel xlwt
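The difference between the two header strings is easy to check in isolation, since it is plain string formatting and needs neither Django nor xlwt. A minimal sketch, with the variable names chosen here for illustration:

```python
name = u"Zinnia_Entries"

# Original: the '+".xls"' sits inside the string literal, so it is never
# evaluated as concatenation and ends up in the header verbatim.
broken = 'attachment; filename = %s +".xls"' % name

# Suggested fix: make the extension part of the format string.
fixed = 'attachment; filename = %s.xls' % name

print(broken)  # attachment; filename = Zinnia_Entries +".xls"
print(fixed)   # attachment; filename = Zinnia_Entries.xls
```

A common variant drops the spaces around = and quotes the file name, i.e. 'attachment; filename="%s.xls"' % name, which also copes with names that contain spaces.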

Solved the problem. Apparently someone had added some custom middleware that was rewriting the response body, stripping and appending content, when it shouldn't have touched the file at all.
Anyway, with it gone the file exports perfectly.
@Storm - Thank you for all the help!
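If the middleware is still needed for HTML pages, an alternative to removing it is to have it skip anything that isn't HTML. The sketch below shows only the guard, in plain Python with no Django imports; process_body is a hypothetical helper, not the code from the project, and the regex is the one visible in the traceback.

```python
import re


def strip_spaces_between_tags(html):
    """Collapse whitespace between HTML tags (same regex as in the traceback)."""
    return re.sub(r">\s+<", "><", html)


def process_body(content_type, body):
    """Hypothetical helper: only rewrite HTML, pass everything else through."""
    if content_type.split(";")[0].strip().lower() != "text/html":
        return body  # binary downloads such as .xls are left untouched
    text = body.decode("utf-8").strip()
    return strip_spaces_between_tags(text).encode("utf-8")


# The first bytes of the .xls output quoted in the question.
xls_bytes = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1\x00"
html_bytes = b"<p>a</p>   <p>b</p>\n"

untouched = process_body("application/vnd.ms-excel", xls_bytes)
stripped = process_body("text/html; charset=utf-8", html_bytes)
```

In a real middleware the same check would go at the top of process_response, reading the content type from the response headers before touching response.content.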
