Latin1/UTF-8 Encoding Problems in AngularJS - django

I have a Python 2.7 Django + AngularJS app. There's an input field that feeds into the data model and the data is sent to the server using Angular's $http. When the input field contains the character "é", Django doesn't like it. When I use "★é" Django has no problem with it. It seems to me that the star character, being outside the latin1 charset, forces the encoding to utf-8, while when the only non-ASCII character is "é", Angular sends the data as latin1, which confuses my Python code.
The error message from Django is:
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe9 in position 0: invalid continuation byte
Telling the simplejson.loads() function on the server to read the data using the ISO-8859-1 (latin1) encoding worked fine when my input string contained just the é in it and no star, so that proves that the data coming from the browser is latin1 unless forced to utf-8 by non-latin1 characters, like the star.
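The decoding failure can be reproduced outside Django. This is a minimal Python 3 sketch, assuming the request body arrives as raw bytes; the variable and function names are illustrative and not taken from the app.

```python
import json

# The same JSON text, serialised with two different encodings.
text = json.dumps({"recipe": "é"}, ensure_ascii=False)
latin1_body = text.encode("latin-1")   # é becomes the single byte 0xE9
utf8_body = text.encode("utf-8")       # é becomes the two bytes 0xC3 0xA9

def try_decode(body):
    """Return the parsed JSON, or the decoding error if the bytes are not valid UTF-8."""
    try:
        return json.loads(body.decode("utf-8"))
    except UnicodeDecodeError as exc:
        return exc

utf8_result = try_decode(utf8_body)
latin1_result = try_decode(latin1_body)

# The star cannot be represented in latin1 at all, so a body containing it
# has to use some other encoding.
try:
    "★é".encode("latin-1")
    star_fits_latin1 = True
except UnicodeEncodeError:
    star_fits_latin1 = False

print(utf8_result)
print(type(latin1_result).__name__)
print(star_fits_latin1)
```

Only the latin1 body triggers the error, which matches the behaviour described above.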
Is there a way to tell Angular to always send data using utf-8?
The Angular code that sends the data to the server:
$http({
    url: $scope.dataUrl,
    method: 'POST',
    data: JSON.stringify({recipe: recipe}),
    headers: {'Content-Type': 'application/json'}
}).success(...).error(...);
The Django code that reads the data:
recipe = simplejson.loads(request.raw_post_data)['recipe']

I found one way that works, using the transformRequest config parameter.
transformRequest: function (data, headersGetter) {
    return encode_utf8(JSON.stringify(data));
}
function encode_utf8(s) {
    return unescape(encodeURIComponent(s));
}
I'm using the encode function found and explained at http://ecmanaut.blogspot.com/2006/07/encoding-decoding-utf8-in-javascript.html and the JSON library found at http://www.JSON.org/json2.js.
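For readers more at home in Python, the effect of encode_utf8 can be sketched as follows: unescape(encodeURIComponent(s)) yields a string in which each character stands for one byte of the UTF-8 encoding of s. The functions below are a Python 3 illustration of that idea, not part of the Angular code.

```python
def encode_utf8(s):
    """Mimic unescape(encodeURIComponent(s)): one character per UTF-8 byte."""
    return s.encode("utf-8").decode("latin-1")

def decode_utf8(byte_string):
    """Inverse operation: turn the one-character-per-byte string back into text."""
    return byte_string.encode("latin-1").decode("utf-8")

wire = encode_utf8("é")          # two characters: U+00C3 and U+00A9
restored = decode_utf8(wire)     # back to the single character é
star_wire = encode_utf8("★é")    # three bytes for the star, two for é

print(len(wire), len(star_wire), restored)
```

When such a string is then sent as latin1, the bytes on the wire are exactly the UTF-8 encoding, which is why the server can decode it.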

Related

How to have my path and query parameters from my Post webservices be in UTF-8 encoding and display Chinese character correctly in the backend in Java

I have this post webservice that takes in the query parameter that we make a string called newStationName:
@POST
@Path("/station/{stationid}/")
public void setStationOptions(@PathParam("stationid") Integer stationID,
        @QueryParam("stationname") String newStationName)
{
The problem with that query parameter is that if someone passes a name for the station in Chinese, it shows up on the framework end in latin-1 (ISO8859-1) encoding and looks like a bunch of garbled text. The way I've gotten it to display correctly is by getting the string's bytes and changing them from latin-1 to utf-8 like this:
try {
    decodedNewStationName = new String(newStationName.getBytes("ISO8859-1"), "utf-8");
}
catch (UnsupportedEncodingException e) {
    log.error("Can't decode newStationName");
    e.printStackTrace();
}
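The workaround above reverses a wrong decoding step. The same round trip can be shown in a few lines of Python 3; the station name is a made-up example, and the sketch only illustrates the mechanism, not the Jersey or Tomcat configuration.

```python
station_name = "北京站"

# What happens when UTF-8 bytes are read as if they were latin-1:
garbled = station_name.encode("utf-8").decode("latin-1")

# The repair used in the Java snippet: get the latin-1 bytes back,
# then decode them as UTF-8.
repaired = garbled.encode("latin-1").decode("utf-8")

print(len(station_name), len(garbled), repaired == station_name)
```

The repair works because latin-1 maps every byte to exactly one character, so no information is lost in the wrong decoding step.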
I would like to find a global solution for this so that every time we receive a user inputted Chinese string from our web app on any webservice, we don't need to put this try catch block there as well.
I've tried playing with our Tomcat and Jersey server filters and encoding and that hasn't worked. I've also tried setting the request and response encoding to utf-8 and that hasn't worked. I've also tried encoding the parameter of the url in utf-8, but that just sends the string back in url-encoded utf-8 that looks like this: "%C%A%D etc..." and then that needs to be decoded.
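The percent-encoded form mentioned here is the normal way to carry non-ASCII text in a URL, and decoding it with the right charset gives the original string back. A small Python 3 sketch using urllib.parse, with a made-up station name:

```python
from urllib.parse import parse_qs, quote, unquote

station_name = "北京站"

encoded = quote(station_name)                    # percent-encoded UTF-8 bytes
decoded_ok = unquote(encoded)                    # decoded as UTF-8 (the default)
decoded_wrong = unquote(encoded, encoding="latin-1")  # decoded with the wrong charset

# A query string built from the encoded value parses back to the original text.
params = parse_qs("stationname=" + encoded)

print(encoded)
print(decoded_ok == station_name, decoded_wrong == station_name)
```

The garbled text described in the question corresponds to the decoded_wrong case: the bytes are fine, but they are interpreted as latin-1 somewhere on the server side.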
I haven't been able to find anything that has worked globally to this point but I feel that there has to be something I'm missing.
I have also edited the Connectors in the server.xml file to have their URI Encoding to UTF-8 as well as the server.xml file encoding itself in utf-8.
<Connector URIEncoding="UTF-8" port="8080"
protocol="HTTP/1.1"
connectionTimeout="20000"
redirectPort="8443" />
I have also changed the encoding of the web.xml file to be in utf-8 as well, and the character encoding for the HttpServletRequest and Response are in utf-8

Django: rest api: Not receiving the complete JSON string when it's very big

I am sending a very long JSON string using
@api_view(['GET'])
def sendlargedata(request):
    ....
    return HttpResponse(json.dumps(all_graphs_data, default=str), status=200, content_type="application/json")
When I check the data in the Firefox response it says
SyntaxError: JSON.parse: unterminated string at line 1 column 1048577 of the JSON data
So how can I overcome any size or length restriction and send and receive the full data?
Your default doesn't work in all the cases, I believe you have to escape too.
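One way to narrow this down is to check on the server whether the serialised text is valid JSON and how large it is. Column 1048577 is exactly 1 MiB plus one, which suggests the text was cut at a 1 MiB boundary somewhere, though the question does not show where. The helper below is a hypothetical Python 3 sketch with made-up sample data, not part of the original view.

```python
import json

ONE_MIB = 1024 * 1024

def check_json_payload(data):
    """Serialise data, confirm it parses again, and report its size in bytes."""
    text = json.dumps(data, default=str)
    json.loads(text)  # raises ValueError if the serialised text is not valid JSON
    size = len(text.encode("utf-8"))
    return size, size > ONE_MIB

# Made-up stand-in for all_graphs_data: roughly 2 MB of strings.
sample = {"graphs": ["x" * 1000] * 2000}
size, exceeds_one_mib = check_json_payload(sample)

print(size, exceeds_one_mib)
```

If the payload parses on the server and is larger than 1 MiB, the truncation is more likely to happen in transit or in the tool used to view the response than in json.dumps.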

Decoding and encoding JSON in Django

I was following some Django REST framework tutorials and found some obscure code. This snippet is from the customised user model of a project that uses JWT for authentication.
As I commented in the snippet, I can't see why they first encode the data and then decode it again. I suspect this pattern is not specific to this tutorial but quite general. Could anyone explain it, please?
def _generate_jwt_token(self):
    """
    Generates a JSON Web Token that stores this user's ID and
    has an expiry date set to 60 days into the future.
    """
    dt = datetime.now() + timedelta(days=60)
    token = jwt.encode({  # first encode here
        'id': self.pk,
        'exp': int(dt.strftime('%s'))
    }, settings.SECRET_KEY, algorithm='HS256')
    return token.decode('utf-8')  # returns decoded object
“Encoding” usually refers to converting data to its binary representation (bytes).
JWT (JSON Web Token) encoding uses a specific data structure and cryptographic signing to allow secure, authenticated exchanges.
The steps to encode data as a JWT are as follows:
The payload is converted to JSON and encoded using base64 (the URL-safe variant, without padding).
A header, specifying the token type (e.g. JWT) and the signature algorithm to use (e.g. HS256), is encoded similarly.
A signature is derived from your secret key (a shared secret in the case of HS256) and the two previous values.
The result is obtained by joining header, payload and signature with dots. In the PyJWT version used here, the output is a byte string.
Decoding it using UTF-8 transforms this byte string into a Unicode string:
>>> encoded_bin = jwt.encode({'some': 'data'}, 'secret_sig', algorithm='HS256')
>>> type(encoded_bin)
<class 'bytes'>
>>> encoded_string = encoded_bin.decode('utf-8')
>>> type(encoded_string)
<class 'str'>
Notes:
It is not always possible to decode bytes to a string. Base64-encoding your data allows you to store any bytes as a text representation, but the encoded form requires more space (+33%) than its raw representation.
A byte string is prefixed by a b in your Python interpreter (e.g. b"a binary string").
In PyJWT 2.0 and later, jwt.encode() returns a str directly, so the .decode('utf-8') step applies only to older versions.
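The encoding steps listed above can be sketched with the standard library alone. This is an illustration of the HS256 case under the assumption of a shared secret, not a replacement for a JWT library; the function names are made up for the example.

```python
import base64
import hashlib
import hmac
import json

def b64url(data):
    """URL-safe base64 without the trailing '=' padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=")

def encode_hs256(payload, secret):
    """Build header.payload.signature as a byte string."""
    header = {"typ": "JWT", "alg": "HS256"}
    segments = [
        b64url(json.dumps(header, separators=(",", ":")).encode("utf-8")),
        b64url(json.dumps(payload, separators=(",", ":")).encode("utf-8")),
    ]
    signing_input = b".".join(segments)
    signature = hmac.new(secret.encode("utf-8"), signing_input, hashlib.sha256).digest()
    return signing_input + b"." + b64url(signature)

def read_payload(token):
    """Decode the middle segment of a token string (no signature check)."""
    segment = token.split(".")[1]
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))

token_bytes = encode_hs256({"some": "data"}, "secret_sig")
token_str = token_bytes.decode("utf-8")   # the step the question asks about

# The +33% overhead of base64: 300 raw bytes become 400 characters.
overhead_example = len(base64.b64encode(b"x" * 300))

print(type(token_bytes).__name__, type(token_str).__name__)
print(read_payload(token_str))
```

Because every segment is base64 text, the byte string contains only ASCII characters, so decoding it as UTF-8 always succeeds and simply gives a str that is easier to put into JSON responses or headers.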

How to send nested form-data using postman?

Assume I have some data as below,
{
"name":"John",
"age":30,
"cars":
{
"car_img_1":"car_img_file1",
"car_img_2":"car_img_file2",
"car_img_3":"car_img_file3"
}
}
How can I send it using POSTMAN with form-data?
NOTES
1. car_img_fileX will be a file (.jpg, .png, etc.)
2. What I've tried -->> POSTMAN Screenshot.
3. The local server is built with the Django framework
Current Output
Receiving 5 different items instead of nested data --> see this Pycharm Debugger Output
Try this:
cars[0][car_img_1]:car_img_file1
cars[1][car_img_2]:car_img_file2
You can insert it in "bulk-edit" mode.
I found this answer from this problem. Edited as per your code.
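Plain Django exposes form-data fields as flat keys, so bracketed names like the ones above still arrive as separate items unless something reassembles them. The helper below is a hypothetical Python 3 sketch of such a reassembly step; it treats every bracketed segment as a dict key and is not a Django or DRF API.

```python
import re

def nest_form_keys(flat):
    """Rebuild nested dicts from keys such as 'cars[0][car_img_1]'."""
    nested = {}
    for key, value in flat.items():
        # 'cars[0][car_img_1]' -> ['cars', '0', 'car_img_1']
        parts = re.findall(r"[^\[\]]+", key)
        node = nested
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return nested

# Flat fields as they might arrive from the form; values are placeholders.
flat_fields = {
    "name": "John",
    "age": "30",
    "cars[0][car_img_1]": "car_img_file1",
    "cars[1][car_img_2]": "car_img_file2",
}
nested_fields = nest_form_keys(flat_fields)

print(nested_fields)
```

In a real view the values for the image keys would be uploaded file objects rather than strings, and numeric segments such as "0" might be better converted to list indices.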
Convert your Image Fields to base64Image and send it through the JSON data.
All you need to do is:
go to https://www.base64-image.de/ and convert the image to base64 format. Copy the encoded result.
Install django-extra-fields package in your project from here
In your serializer_class, import and change the image field to Base64ImageField:
serializers.py
...
from drf_extra_fields.fields import Base64ImageField
...
Now, go to your postman and send the JSON data like the following. Remember to send that encoded image in your image field in JSON.
{
"name":"John",
"age":30,
"cars":
{
"car_img_1":"<base64 encoded image>",
"car_img_2":"<base64 encoded image>",
"car_img_3":"<base64 encoded image>"
}
}
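The base64 text can also be produced locally instead of through a website. A minimal Python 3 sketch, assuming the image content is already available as bytes; the placeholder bytes below stand in for a real file, which would normally be read with open(path, "rb").

```python
import base64
import json

def to_base64_text(raw_bytes):
    """Encode raw bytes as ASCII base64 text suitable for a JSON string."""
    return base64.b64encode(raw_bytes).decode("ascii")

# Placeholder content standing in for a real image file.
fake_image = b"\x89PNG\r\n\x1a\n" + b"\x00" * 16

payload = {
    "name": "John",
    "age": 30,
    "cars": {"car_img_1": to_base64_text(fake_image)},
}
body = json.dumps(payload)

# Decoding on the receiving side gives the original bytes back.
round_trip = base64.b64decode(json.loads(body)["cars"]["car_img_1"])

print(body)
```

The resulting body can be pasted into Postman as raw JSON.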

Django - Decoding MIME Header with Base64 and UTF-8

I am creating a web email interface to read IMAP accounts. I'm having problems decoding a certain email header.
I obtain the following From header (specific example from an event email):
('"=?UTF-8?B?QmVubnkgQmVuYXNzaQ==?=" <NOREPLY@NOREPLY.LOCKNLOADEVENTS.COM>', None)
I separate the first part:
=?UTF-8?B?QmVubnkgQmVuYXNzaQ==?=
According to some research, it's apparently a Base64-encoded UTF-8 header.
I tried to decode it using the Base64 decoder:
# Separate sender name from email itself
first_part = header_text[1:header_text.index('" <')]
print "First part:", first_part
import base64
decoded_first_part = base64.urlsafe_b64decode(first_part)
print decoded_first_part
But I obtain a
TypeError: Incorrect padding.
Can anybody help me figure out what's wrong?
Thank you
>>> import base64
>>> base64.decodestring('QmVubnkgQmVuYXNzaQ==')
'Benny Benassi'
But you probably want to use a proper IMAP library for doing this stuff.
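The original attempt fails because the whole encoded word, including the =?UTF-8?B? prefix and the ?= suffix, was passed to the Base64 decoder; only the part between them is Base64. In Python 3 the standard library's email package handles this format directly. A minimal sketch, using the header from the question:

```python
from email.header import decode_header, make_header
from email.utils import parseaddr

raw_from = '"=?UTF-8?B?QmVubnkgQmVuYXNzaQ==?=" <NOREPLY@NOREPLY.LOCKNLOADEVENTS.COM>'

# Split the display name from the address, then decode the encoded word.
encoded_name, address = parseaddr(raw_from)
sender_name = str(make_header(decode_header(encoded_name)))

print(sender_name)
print(address)
```

decode_header reads the charset and the transfer encoding (B for Base64, Q for quoted-printable) from the header itself, so the same code also covers headers that use other charsets.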