Decoding and encoding JSON in Django - django

I was following some Django REST framework tutorials and came across some code I don't understand. This snippet is from a customised user model in a project that uses JWT for authentication.
As I commented in the snippet, I can't see why they first encode the data and then decode it again. I suspect this pattern is not specific to this tutorial but is quite general. Could anyone explain it, please?
def _generate_jwt_token(self):
    """
    Generates a JSON Web Token that stores this user's ID and
    has an expiry date set to 60 days into the future.
    """
    dt = datetime.now() + timedelta(days=60)
    token = jwt.encode({  # first encode here
        'id': self.pk,
        'exp': int(dt.strftime('%s'))
    }, settings.SECRET_KEY, algorithm='HS256')
    return token.decode('utf-8')  # returns decoded object

“Encoding” usually refers to converting data to its binary representation (bytes).
JWT (JSON Web Token) encoding uses a specific data structure and cryptographic signing to allow secure, authenticated exchanges.
The steps to encode data as a JWT are as follows:
The payload is converted to JSON and encoded using base64url (URL-safe base64 without padding).
A header, specifying the token type (e.g. JWT) and the signature algorithm to use (e.g. HS256), is encoded the same way.
A signature is derived from your secret key and the two previous values.
The result is obtained by joining header, payload and signature with dots. In PyJWT versions before 2.0 the output is a binary string (bytes); from 2.0 onwards jwt.encode returns a str, so the extra decode step is no longer needed.
More information here.
Decoding it using UTF-8 transforms this binary string into a Unicode string:
>>> encoded_bin = jwt.encode({'some': 'data'}, 'secret_sig', algorithm='HS256')
>>> type(encoded_bin)
<class 'bytes'>
>>> encoded_string = encoded_bin.decode('utf-8')
>>> type(encoded_string)
<class 'str'>
Notes:
It is not always possible to decode bytes to a string, because arbitrary bytes are not necessarily valid UTF-8. Base64-encoding your data allows you to store any bytes as a text representation, but the encoded form requires more space (+33%) than its raw representation.
A binary string is prefixed by a b in your Python interpreter (eg. b"a binary string")
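The encoding steps listed above can be sketched with the standard library alone. This is only an illustration of the HS256 token layout, not PyJWT's actual implementation; the helper names b64url and encode_hs256 are made up for this example, and real code should keep using a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> bytes:
    """URL-safe base64 with the trailing '=' padding stripped, as JWT uses."""
    return base64.urlsafe_b64encode(data).rstrip(b"=")


def encode_hs256(payload: dict, secret: str) -> bytes:
    """Hypothetical helper: build header.payload.signature by hand."""
    header = {"typ": "JWT", "alg": "HS256"}
    segments = [
        b64url(json.dumps(header, separators=(",", ":")).encode("utf-8")),
        b64url(json.dumps(payload, separators=(",", ":")).encode("utf-8")),
    ]
    # The signature covers "header.payload" and is keyed with the secret.
    signing_input = b".".join(segments)
    signature = hmac.new(secret.encode("utf-8"), signing_input, hashlib.sha256).digest()
    segments.append(b64url(signature))
    return b".".join(segments)


token_bytes = encode_hs256({"some": "data"}, "secret_sig")  # bytes
token_str = token_bytes.decode("utf-8")                     # str, safe to put in JSON
```

Every character of the token is plain ASCII, which is why decoding it as UTF-8 always succeeds and simply changes the Python type from bytes to str.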

Related

Django JsonResponse converts numbers to string

I'm using JsonResponse(status=200, data=<python_dict_with_data>) to return from my Django backend API. However, upon inspecting the result of this, all number values get converted to strings (these values are the value portion of the dict, not key). This creates a problem in the frontend receiving this response because now I have to parse them as integers to do formatting and lightweight calculations. Is there a way to prevent this conversion when returned from Django?
Or is there a way the response is parsed correctly in the frontend? I'm using Axios library in React in the frontend.
Is there a way to prevent this conversion when returned from Django?
The keys will indeed be transformed into strings, because integers as keys are illegal in JSON. If you use a validator like JSONLint, you will see that {1: 1} is invalid JSON, whereas { "1": 1 } is valid JSON. The Python JSON encoder thus falls back on converting integer keys to strings, so that it still produces valid content.
If you have to do lightweight calculations, likely using these as keys is not a good idea. For example if you have data that looks like:
{ 1: 4, 2: 5 }
you might consider restructuring the data, for example to:
{ "data": [ {"key": 1, "value": 4}, {"key": 2, "value": 5} ] }
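A small sketch of both points, using only the standard json module (Django's JsonResponse serialises through the same machinery): integer keys come out as strings, while integer values stay numbers, and the restructured form keeps everything numeric.

```python
import json

data = {1: 4, 2: 5}

# Integer keys are converted to strings; the values remain integers.
encoded = json.dumps(data)  # '{"1": 4, "2": 5}'

# Restructured so that the former keys travel as values instead.
restructured = {"data": [{"key": k, "value": v} for k, v in data.items()]}
decoded = json.loads(json.dumps(restructured))

print(encoded)
print(decoded["data"][0])  # {'key': 1, 'value': 4}
```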
You can also return it as HTTP response, and do parsing at the JavaScript end, but likely that will only result in more trouble.
For Decimal numbers, it will also use a string. By default Django uses the DjangoJSONEncoder [Django-doc], whose documentation lists this conversion:
Decimal, Promise (django.utils.functional.lazy() objects), UUID:
A string representation of the object.
If we for example encode a Decimal, we see:
>>> djenc = DjangoJSONEncoder()
>>> djenc.encode({'a': Decimal('0.25')})
'{"a": "0.25"}'
You can subclass the encoder and resolve the Decimal to a float, for example, but note that this can result in loss of precision. This is exactly why a string is used: to ensure that no digits are lost:
from django.core.serializers.json import DjangoJSONEncoder
from decimal import Decimal

class MyDjangoJSONEncoder(DjangoJSONEncoder):
    def default(self, o):
        if isinstance(o, Decimal):
            return float(o)
        return super().default(o)
This then produces:
>>> mydjenc = MyDjangoJSONEncoder()
>>> mydjenc.encode({'a': Decimal('0.25')})
'{"a": 0.25}'
You can then use this encoder in your JsonResponse:
from decimal import Decimal
from django.http import JsonResponse

def myview(request):
    # …
    return JsonResponse(encoder=MyDjangoJSONEncoder, data={'a': Decimal('0.25')})
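The precision caveat mentioned above can be seen without Django. The following sketch uses the standard json module and an arbitrary example value with more digits than a float can hold; the variable names are only illustrative.

```python
import json
from decimal import Decimal

price = Decimal("0.12345678901234567890123")

# Sending the value as a string preserves every digit.
as_string = json.dumps({"a": str(price)})
round_trip_str = Decimal(json.loads(as_string)["a"])

# Sending it as a float silently drops the digits a float cannot represent.
as_float = json.dumps({"a": float(price)})
round_trip_float = Decimal(str(json.loads(as_float)["a"]))

print(round_trip_str == price)    # True
print(round_trip_float == price)  # False
```

For values such as 0.25 the float conversion happens to be exact, so whether the custom encoder is acceptable depends on the range of values the API returns.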

Wrong encoding when retrieving get argument

I have a URL encoded with URL encoding (percent-encoding), namely: /filebrowser/?cd=bank/fran%E7ais/essais
The problem is that if I retrieve the argument through :
path = request.GET.get('relative_h', None)
I get :
/filebrowser/?cd=bank/fran�ais/essais
instead of:
/filebrowser/?cd=bank/français/essais
or :
/filebrowser/?cd=bank/fran%E7ais/essais
Yet, %E7 does correspond to 'ç', as you can see there.
And since the %E7 is decoded with the replacement character, I can't even use urllib.parse.unquote to get my 'ç' back...
Is there a way to get the raw argument or the correctly decoded string?
Switching the request encoding to latin-1 before accessing the parameter returned the correctly decoded string for me, when running your example locally.
request.encoding = 'latin-1'
path = request.GET.get('relative_h', None)
However, I'm not able to tell you why that would be, since I would have assumed that the default encoding of utf-8 would have handled that particular character.
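A likely explanation, sketched with urllib.parse from the standard library rather than Django's request object: %E7 is the single Latin-1 byte for 'ç', whereas in UTF-8 the same character is the two bytes %C3%A7. A lone 0xE7 byte is not valid UTF-8, so a UTF-8 decoder substitutes the replacement character.

```python
from urllib.parse import quote, unquote

raw = "bank/fran%E7ais/essais"

# Default decoding is UTF-8 with errors="replace": 0xE7 becomes U+FFFD.
as_utf8 = unquote(raw)

# Telling the decoder the bytes are Latin-1 recovers the intended character.
as_latin1 = unquote(raw, encoding="latin-1")

# What the URL would look like if it had been percent-encoded as UTF-8.
utf8_escaped = quote("bank/français/essais")

print(as_utf8)       # bank/fran\ufffdais/essais (shown as fran�ais)
print(as_latin1)     # bank/français/essais
print(utf8_escaped)  # bank/fran%C3%A7ais/essais
```

So the link itself was generated with Latin-1 percent-encoding, and setting request.encoding to 'latin-1' works because it matches how the URL was produced.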

Latin1/UTF-8 Encoding Problems in AngularJS

I have a Python 2.7 Django + AngularJS app. There's an input field that feeds into the data model and the data is sent to the server using Angular's $http. When the input field contains the character "é", Django doesn't like it. When I use "★é" Django has no problem with it. It seems to me that the star character being outside the latin1 charset forces the encoding to utf-8, while when the only non-latin character is "é", Angular sends the data as latin1, which confuses my python code.
The error message from Django is:
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe9 in position 0: invalid continuation byte
Telling the simplejson.loads() function on the server to read the data using the ISO-8859-1 (latin1) encoding worked fine when my input string contained just the é in it and no star, so that proves that the data coming from the browser is latin1 unless forced to utf-8 by non-latin1 characters, like the star.
Is there a way to tell Angular to always send data using utf-8?
The Angular code that sends the data to the server:
$http({
    url: $scope.dataUrl,
    method: 'POST',
    data: JSON.stringify({recipe: recipe}),
    headers: {'Content-Type': 'application/json'}
}).success(...).error(...);
The Django code that reads the data:
recipe = simplejson.loads(request.raw_post_data)['recipe']
I found one way that works, using the transformRequest config parameter.
transformRequest: function (data, headersGetter) {
    return encode_utf8(JSON.stringify(data));
}

function encode_utf8(s) {
    return unescape(encodeURIComponent(s));
}
I'm using the encode function found and explained at http://ecmanaut.blogspot.com/2006/07/encoding-decoding-utf8-in-javascript.html and the JSON library found at http://www.JSON.org/json2.js.
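The byte-level behaviour described in the question can be reproduced on the server side. This sketch is written for Python 3 (the question uses Python 2.7) and only illustrates the two encodings; it does not show what Angular or the browser does internally.

```python
import json

e_acute = "é"
latin1_bytes = e_acute.encode("latin-1")  # b'\xe9', one byte
utf8_bytes = e_acute.encode("utf-8")      # b'\xc3\xa9', two bytes

# A Latin-1 encoded body is not valid UTF-8, hence the UnicodeDecodeError.
body = b'{"recipe": "' + latin1_bytes + b'"}'
try:
    body.decode("utf-8")
    decoded_as_utf8 = True
except UnicodeDecodeError:
    decoded_as_utf8 = False

# Decoding with the encoding that was actually used works.
recipe = json.loads(body.decode("latin-1"))["recipe"]

# The star cannot be represented in Latin-1 at all.
try:
    "★é".encode("latin-1")
    star_fits_latin1 = True
except UnicodeEncodeError:
    star_fits_latin1 = False

# Python equivalent of unescape(encodeURIComponent(s)): a string whose
# code points are the UTF-8 bytes of the original text.
js_style = e_acute.encode("utf-8").decode("latin-1")  # 'Ã©'
```

When that last string is written out one byte per character, the server receives valid UTF-8, which is why the transformRequest workaround helps.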

python google oauth authentication decode and verify id_token

I am trying to implement Google OAuth authentication in my Django project.
I followed the guide here:
https://developers.google.com/accounts/docs/OAuth2Login?hl=de-DE
I have got the response from exchanging the code: a JSON string that contains several fields such as access_token, id_token, etc.
The id_token is a cryptographically signed JSON object encoded in base64.
I tried to decode id_token with the Python base64 module, but it failed.
I also tried PyJWT, which failed too.
Is there any way to decode and verify it?
Know this is an old post but I found it via Google so I thought somebody else might drop in...
I ended up doing:
id_token = response['id_token']
segments = id_token.split('.')
if len(segments) != 3:
    raise Exception('Wrong number of segments in token: %s' % id_token)
b64string = segments[1]
# Restore the stripped padding so the length is a multiple of 4
padded = b64string + '=' * (-len(b64string) % 4)
payload = base64.urlsafe_b64decode(padded)
An ID token (a JSON Web Signature, JWS) has 3 parts separated by the . character:
Header.Payload.Signature
We can get each part by splitting the token:
parts = token.split(".")
These parts do not have the base64 padding: JWS uses base64url encoding with the trailing = characters omitted (see this), whereas the Python base64 library requires them.
The padding character is =, and padding should be added to the base64 string so that its length is a multiple of 4 characters. For example, if the string is 14 characters long, it needs the padding == at the end so that it is 16 characters in total.
So the formula to calculate the number of padding characters is:
-len(base64_string) % 4
After we add the right padding, we can decode the string with the URL-safe variant of the decoder:
payload = parts[1]
padded = payload + '=' * (-len(payload) % 4)
base64.urlsafe_b64decode(padded)
What we get is the byte-string representation of a JSON object, which we can parse into a Python dict with:
json.loads(base64.urlsafe_b64decode(padded))
Finally we can put everything in a convenience function. Note that it only decodes the payload; it does not verify the signature:
import base64
import json

def parse_id_token(token: str) -> dict:
    parts = token.split(".")
    if len(parts) != 3:
        raise Exception("Incorrect id token format")
    payload = parts[1]
    padded = payload + '=' * (-len(payload) % 4)
    decoded = base64.urlsafe_b64decode(padded)
    return json.loads(decoded)
To learn more about ID tokens, see the excellent article by Takahiko Kawasaki (founder of authlete.com).
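To try the padding logic without a real Google response, the sketch below builds a stand-in token from made-up claims and a dummy signature, then decodes it again. The helper names and the sample values are assumptions for illustration only, and nothing here verifies a signature; for that, a maintained library and Google's published keys are needed.

```python
import base64
import json


def b64url_encode(raw: bytes) -> str:
    """Encode as base64url and strip the padding, as JWS does."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    """Restore the stripped padding, then decode with the URL-safe alphabet."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


# Made-up header and claims, only to have something to decode.
header = {"alg": "RS256", "typ": "JWT"}
claims = {"sub": "1234567890", "email": "user@example.com"}

fake_token = ".".join([
    b64url_encode(json.dumps(header).encode("utf-8")),
    b64url_encode(json.dumps(claims).encode("utf-8")),
    b64url_encode(b"not-a-real-signature"),
])

header_seg, payload_seg, signature_seg = fake_token.split(".")
decoded_header = json.loads(b64url_decode(header_seg))
decoded_claims = json.loads(b64url_decode(payload_seg))
```

The expression -len(segment) % 4 yields 0 when the segment length is already a multiple of 4, so no superfluous padding is appended.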
Well, I figured out why...
I used base64.b64decode(id_token) to decode the whole token.
However, I should split id_token by '.' and decode the parts separately.
That way I can get the header, claims and signature from id_token.
I was careless to ignore those little '.' characters in the string....

Django - Decoding MIME Header with Base64 and UTF-8

I am creating a web email interface to read IMAP accounts. I'm having problems decoding a certain email header.
I obtain the following From header (specific example from an event email):
('"=?UTF-8?B?QmVubnkgQmVuYXNzaQ==?=" <NOREPLY#NOREPLY.LOCKNLOADEVENTS.COM>', None)
I separate the first part:
=?UTF-8?B?QmVubnkgQmVuYXNzaQ==?=
According to some research, it's apparently a Base64-encoded UTF-8 header.
I tried to decode it using the Base64 decoder:
# Separate sender name from email itself
first_part = header_text[1:header_text.index('" <')]
print "First part:", first_part
import base64
decoded_first_part = base64.urlsafe_b64decode(first_part)
print decoded_first_part
But I obtain a
TypeError: Incorrect padding.
Can anybody help me figure out what's wrong?
Thank you
>>> import base64
>>> base64.decodestring('QmVubnkgQmVuYXNzaQ==')
'Benny Benassi'
But you probably want to use a proper IMAP library for doing this stuff.
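The attempt in the question probably failed because the whole encoded word, including the =?UTF-8?B? prefix and ?= suffix, was passed to the base64 decoder; only the part between them is base64. In Python 3 the standard email package can handle the wrapper itself. A minimal sketch, with a placeholder address standing in for the one in the question:

```python
from email.header import decode_header, make_header
from email.utils import parseaddr

# Placeholder address; the display name is the encoded word from the question.
raw_from = '"=?UTF-8?B?QmVubnkgQmVuYXNzaQ==?=" <noreply@example.com>'

# Split the header into the (still encoded) display name and the address.
encoded_name, address = parseaddr(raw_from)

# decode_header understands the =?charset?encoding?data?= wrapper;
# make_header joins the decoded parts into one readable string.
display_name = str(make_header(decode_header(encoded_name)))

print(display_name)  # Benny Benassi
print(address)       # noreply@example.com
```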