When I have to transfer complex values (e.g. a list of dicts) through a Django template to the front end, I typically use json_script to try and prevent XSS vectors.
Recently, I started using lit-element, which has a neat way of pulling attribute values from your custom elements and providing them as properties to your component. You can say:
<my-element items="{{ serialized array of items }}"></my-element>
Then lit-element will take whatever string value is passed to the items attribute and call JSON.parse() on it, so I need a way of serializing my value to JSON.
Since that is relatively trivial in itself, my initial idea was to write a custom template filter and try to match how json_script escapes values. But then I read the source for that function and it explicitly states:
Escape all the HTML/XML special characters with their unicode escapes,
so value is safe to be output anywhere except for inside a tag attribute.
This sounds like attribute values are potentially a more serious XSS vector. So I guess my question is - how to serialize data (in Django/Python) to JSON so it's safe for use in tag attribute values?
I think the reason it says can be used anywhere apart from attributes is that this function doesn't escape speechmarks, this means you would be able to close an attribute and start a new one easily. Django provides the escapejs template tag that escapes speech marks as well.
This function is in the same source that you linked in your question. Following the approach used there should be safe.
Related
I am testing some frontend code, and I can see the code that takes input using the {{}} handlerbars, so if I entered an input = &123 , shouldn't this be converted to &123 and then stored in the server since two double mustache means the characters like '&' is escaped. When I look at the post being send to the server, it still appears as &123.
No, the HTML escaping done by {{}} only has to do with how a value is rendered into the DOM. A string entered using {{input}} is not transformed in any way by Ember, nor should it be.
In general, one does not want to HTML-escape information being held in the DB. The data in the DB should be the actual data. The HTML escaping is something that should be done, as Ember does, "on the way out" when the data is being displayed in an HTML context.
If you really want to keep HTML-escaped data in your server, then you could escape it on the server prior to saving, or perhaps in an Ember serializer. However, when retrieving the data, you'd then have to either unescape it on the server, or send it down the client as is, either unescaping it is the deserializer, or remembering that it is already escaped and putting in the DOM using {{{}}} (triple handlebars).
How can I perform sanitization on string attributes to prevent XSS? Right now my thoughts are to override my base model's save method and iterate over all the strings in the model and set all the string inputs to safe strings. Would this be a good way to approach this problem or is there a better way?
EDIT:
Problem occurs when saving a name attribute ( alert('xss')) for a person in the app. It saves it in a non-sanitized manner into the database. Then that name is loaded in our other site which does not sanitize the output and that's where the script injection occurs! I'd like to sanitize it before saving it to the DB
Handlebars automatically sanitizes strings. If you want to avoid this, you must explicitly use the triple-brace syntax:
{{{myHtmlString}}}
Rather than trying to sanitise the input, you really ought to change that other site to make sure it html-escapes the data it is presenting from the database. Even if you would "sanitise" things on the Ember side, can you guarantee there are no other vulnerabilities which allow someone to inject HTML in the database?
Always escaping anything being presented is really the only safe way to deal with XSS. If you're filtering input you are very likely to not catch every possible way of injecting unexpected input.
Using the Google CTemplate library, I have built a TemplateDictionary of params. Such a dictionary is a map of string keys to a variety of value types.
Typically, one passes CTemplate a template file wherein placeholders for each key in the dictionary are found and substituted.
In one case, though, I wish to emit the entire dictionary in JSON form, and the template language syntax doesn't appear to provide reflection such that I can write placeholders to loop over an unknown number of unknown keys in any arbitrary dictionary.
Did I miss some functionality?
If so, how can I add it?
Will I have to patch the CTemplate code? Much of what I seem to need for the job appears to be marked private i.e. for internal use only...
I've ended up hacking the CTemplate source in template_dictionary.h and template_dictionary.cc, cloning class class TemplateDictionary::DictionaryPrinter to produce a new class class TemplateDictionary::DictionaryJsonPrinter, adapting its member functions to emit JSON syntax.
I've run into a snag in my views.
Here "filtered_posts" is array of Django objects coming back from the model.
I am having a little trouble figuring out how to get as text data that I can
later pack into json instead of using serializers.serialize...
What results is that the data comes double-escaped (escaped once by serializers.serialize and a second time by json.dumps).
I can't figure out how to return the data from the db in the same way that it would come back if I were using the MySQLdb lib directly, in other words, as strings, instead of references to objects. As it stands if I take out the serializers.serialize, I get a list of these django objects, and it doesn't even list them all (abbreviates them with '...(remaining elements truncated)...'.
I don't think I should, but should I be using the __unicode__() method for this? (and if so, how should I be evoking it?)
JSONtoReturn = json.dumps({
'lowest_id': user_posts[limit - 1].id,
'user_posts': serializers.serialize("json", list(filtered_posts)),
})
The Django Rest Framework looks pretty neat. I've used Tastypie before, too.
I've also done RESTful APIs that don't include a framework. When I do that, I define toJSON methods on my objects, that return dictionaries, and cascade the call to related elements. Then I call json.dumps() on that. It's a lot of work, which is why the frameworks are worth looking at.
What you're looking for is Django Rest Framework. It handles related objects in exactly thew way you're expecting it to (you can include a nested object, like in your example, or simply have it output the PK of the related object for the key).
I've found myself unsatisfied with Django's ability to render JSON data. If I use built in serializes then database foreign key relationships are not included in the data (only the keys). Also, it seems to be impossible to include custom data in the json feed that isn't part of the model being serialized.
As a test I implemented a template that rendered some JSON for the resultset of a particular model. I was able to include/exclude whatever parts of the model I wanted and was able to include custom data as well.
The test seemed to work well and wasn't slower than the recommended serialization methods.
Are there any pitfalls to this using this method of serialization?
While it's hard to say definitively whether this method has any pitfalls, it's the method we use in production as you control everything that is serialized, even if the underlying model is changed. We've been running a high traffic application in for almost two years using this method.
Hope this helps.
One problem might be escaping metacharacters like ". Django's template system automatically escapes dangerous characters, but it's set up to do that for HTML. You should look up exactly what the template escaping does, and compare that to what's dangerous in JSON. Otherwise, you could cause XSS problems.
You could think about constructing a data structure of dicts and lists, and then running a JSON serializer on that, rather than directly on your database model.
I don't understand why you see the choice as being either 'use Django serializers' or 'write JSON in templates'. The middle way, which to my mind is much more robust and fits your use case well, is to build up your data as Python lists/dictionaries and then simply use simplejson.dumps() to convert it to a JSON string.
We use this method to get custom JSON format consumed by datatables.net
It was the easiest method we find to accomplish this task and it looks very fine with no problems so far.
You can find details here: http://datatables.net/development/server-side/django
So far, generating JSON from templates, we've run into the need to escape newlines. Looking at doing simplejson.dumps() next.